An expandable collection of examples: https://www.wikidata.org/wiki/User:Markus_Kr%C3%B6tzsch/Nemo_examples
The intro from my email: https://www.wikidata.org/wiki/User:Markus_Kr%C3%B6tzsch/Nemo_for_Wikidata

Cheers,

Markus

On 22.10.25 07:40, Markus Krötzsch wrote:
TLDR: We present Nemo as a new Wikidata query tool that can answer queries, extract subsets, and perform analyses in ways that SPARQL alone can't. It also lets you combine Wikidata with other data sources.

Dear all,

Nemo [1] is a graph rule engine that can be used to query and process data (in many forms, online or offline). It's free and open source [2], and there is a no-install Web application to use it: https://tools.iccl.inf.tu-dresden.de/nemo/

As an early birthday present, we have just released Nemo v0.9, which adds features that make Nemo a useful tool for working with Wikidata content in new ways. This email is a short(ish) intro and teaser towards this -- feedback is very welcome.

## What does Nemo do?

Think of it as an upgrade to the SPARQL query service, with the following differences:

- You can do more powerful data transformations that would time out in SPARQL or would not be possible at all
- You can use and combine data from multiple sources (Wikidata SPARQL results, RDF, CSV, local files, or online data)
- Processing happens partly on your computer, avoiding timeouts
- You can run Nemo in a browser (easy) or on the command line (for heavier jobs)

Nemo still lets you focus on the data, hiding technicalities and low-level issues. It's more than SPARQL, but much simpler than Python.

## How does that work?

You write "queries" -- or rather little "programs" -- in a simple language based on if-then rules. Here is an example that uses no external data at all:

https://tinyurl.com/2muju6sy (find common ancestors of two people)

Technically, this is a logic program in (a variant of) Datalog. Using a few more Nemo features, you can apply such rules to Wikidata content:

https://tinyurl.com/2mzfutcj (find common ancestors of Ada and Moby)

Btw, you can share any Nemo program by sharing a link (the URL updates as you type).

## Slow down, I never heard of "Datalog". How do I read this?

It's actually quite simple.
Data is represented in "facts" such as "father(Alice, Bob)", which we could use to say that Alice has father Bob. This is a bit like triples in RDF/SPARQL, but you can have any number of parameters (as in, say, "degree(Alice, MSc, Physics, 2025, TUDresden)").

Facts are used to compute new facts using rules like this:

uncle(?child, ?bro) :- parent(?child, ?p), brother(?p, ?bro) .

The ?... parts are variables, ":-" means "IF", and "," means "AND". So the rule says: ?child has uncle ?bro IF ?child has a parent ?p AND ?p has a brother ?bro.

In a way, rules are like simple SPARQL query patterns whose results you store as new facts. The power of Datalog is that you can use these facts in future rule applications, producing more information step by step rather than in one huge SPARQL query.

## Why not just use SPARQL?

The Ada/Moby example above can also be solved by a SPARQL query, though that query will time out on WDQS. However, Nemo can also do things that are outright impossible even with the most powerful SPARQL services. The "Examples" button on the Web app shows some of the possibilities:

- Query for things that SPARQL cannot do in principle, such as the longest winning streak of your favourite sports team ("Winning streaks in sports")
- Combine third-party data with Wikidata on the fly ("Old trees", "CO2 emitting countries")
- Do multi-step analyses that would be very complex to express in SPARQL ("Empty classes in Wikidata")
- Directly query RDF data without a SPARQL service ("Wikipedia articles vs. labels")

## What's behind it?

At its heart, Nemo is an in-memory data processing engine, written in Rust. The data model is relational, but weakly typed (like RDF, CSV, and JSON) rather than strongly typed (like SQL). The Web app runs locally, in your browser. Your program and any local data you might use (with "Add input files") will not be uploaded anywhere [3].
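For instance, a tiny self-contained program like the following runs entirely in your browser. This is just a toy sketch: the facts are invented for illustration, and the rule is the uncle example from above.

```
% Invented facts: carol's parent is alice, and alice has brother bob.
parent(carol, alice) .
brother(alice, bob) .

% IF ?child has a parent ?p AND ?p has a brother ?bro,
% THEN ?bro is an uncle of ?child.
uncle(?child, ?bro) :- parent(?child, ?p), brother(?p, ?bro) .
```

Pasting this into the Web app should derive the single new fact uncle(carol, bob).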
Even in the browser, it is feasible to work with larger files (millions of facts), but there are limits (don't try to import the whole Wikidata dump there).

For SPARQL, Nemo tries to optimise by querying only for the values that your program needs. This is why some of the examples can import from SPARQL queries like "?s ?p ?o" without actually downloading all of Wikidata.

Nemo runs an extension of Datalog enriched with SPARQL-style datatypes and "filter" functions, aggregates, and negation (both must be stratified, i.e., used in non-recursive ways). As usual in Datalog, the order of rules does not matter at all (although the examples are all ordered following the "natural" processing pipeline). This "declarativity" allows Nemo to automatically optimise rule applications and data imports. More academic documentation can be found on our publication page: https://github.com/knowsys/nemo/wiki/Publications

## Limitations? Future plans?

Loads (of both). Key limitations from a Wikidata perspective include missing support for dates and geocoordinates (workaround: use SPARQL to decompose these into several numbers). You might also find that more data processing functions should be implemented (let us know). The Web app could benefit from richer result display and download options.

In the mid term, we plan to support more data formats, notably JSON, for native import. We are also looking into programming features for structuring longer programs. However, we would also like to hear back from you to decide where to go next. We have a detailed handbook [4], but more Wikidata-related materials and tutorials might be desirable. Again, let us know what you think.

Nemo is a university-based OSS project and still a prototype, so bear with us if you discover bugs. We will try to answer your queries asap, and we also have a public user chatroom [5]. Thanks are due to all contributors [6], and for v0.9.0 especially to Alex Ivliev, Lukas Gerlach, and Maximilian Marx.
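To close with a small taste of the aggregate and negation features mentioned above, here is a sketch that counts children per person and finds persons without recorded children. All facts and predicate names are invented for illustration, and the exact aggregate and negation syntax is described in the handbook [4].

```
% Toy data (invented). parent(?child, ?p) reads: ?child has parent ?p.
person(alice) . person(bob) . person(carol) .
parent(carol, alice) .

% Aggregate in the rule head: count each parent's children.
childCount(?p, #count(?c)) :- parent(?c, ?p) .

% Stratified negation with "~": persons that are nobody's parent.
isParent(?p) :- parent(?c, ?p) .
childless(?p) :- person(?p), ~isParent(?p) .
```

Under the rule reading explained earlier, this should derive childless(bob) and childless(carol), since only alice appears as a parent.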
Cheers,

Markus

[1] https://knowsys.github.io/nemo-doc/
[2] https://github.com/knowsys/nemo
[3] However, if you use Nemo with data from SPARQL, then some data might be sent to the SPARQL endpoint (your SPARQL query for a start, but possibly also specific data values your program needs data for).
[4] https://knowsys.github.io/nemo-doc/
[5] https://gitter.im/nemo/community or simply #nemo_community:gitter.im
[6] https://github.com/knowsys/nemo/graphs/contributors

--
Prof. Dr. Markus Kroetzsch
Knowledge-Based Systems Group
Faculty of Computer Science
TU Dresden
+49 351 463 38486
https://kbs.inf.tu-dresden.de/

_______________________________________________
Wikidata mailing list -- [email protected]
Public archives at https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/QCJCQAA7XDK34Y5S2U5BQMGDCQYFKEJG/
To unsubscribe send an email to [email protected]
