Lingro <http://creativecommons.org/weblog/entry/8341>

By Cameron Parkins on Weblog

Lingro <http://lingro.com/> is a project that aims to create an online
environment that gives anyone reading a foreign-language website a quick
and easy means to translate words they don't understand. Simple in
concept, yet profound in implication, Lingro (which we have blogged about
twice <http://creativecommons.org/weblog/entry/7894>
before<http://creativecommons.org/weblog/entry/8168>)
uses open dictionaries and user-submitted, CC BY-SA
licensed<http://creativecommons.org/licenses/by-sa/3.0/>,
definitions to expand its ever-growing database. We recently caught up with
co-founder Paul Kastner and were able to discuss in-depth the philosophies
behind Lingro, how it accomplishes what it does, how it uses CC licenses,
and what its future holds.

*What is Lingro's history? How did it get started? Who is involved?*

The idea to create a new kind of on-line dictionary which would help people
learn languages was conceived by my co-founder, Artur Janc. A few years ago,
Artur was practicing his Spanish by reading Harry Potter y la piedra
filosofal<http://www.amazon.com/Harry-Potter-y-piedra-filosofal/dp/8478886540>.
He had taken all the advanced Spanish courses at the university where he was
studying, and like most students, had a good grasp on the grammar and core
vocabulary of the language. When he started reading, he found that while he
could understand the structure of the writing, there were so many words he
hadn't come across before that he was spending more time looking up words in
a dictionary than actually reading!

Artur thought there must be a better way, and built a prototype of what
would become Lingro, allowing him to look up words in a document he was
reading just by clicking on the word. This was a huge improvement in terms
of speed and reducing distraction compared to the usual method of looking up
words in a dictionary. He also built a flashcard game, which let him review
the words he looked up while reading after he was done. We realized that
this tool could be useful for other people as well, so we set out to build a
version that anybody could use, with many more languages than the original
English dictionary. We got a lot of help along the way from Holmes Wilson,
one of the people behind Miro <http://www.getmiro.com/>, the free
open-source video player, as well as downhillbattle.org, which promotes a
fairer music industry.

That's only half the story - since we launched in November 2007, there's
been a really great community developing around the site. Lots of people
have been contributing translations and expanding the dictionaries. We've
also gotten a ton of feedback on the tools, which has been invaluable as we
build and expand them. All the support we've gotten has been really
flattering, and the project definitely wouldn't be anything close to what it
is now without all of the dedicated people who have contributed.

*Lingro takes aim at a problem that has plagued those in web development for
some time - making their sites readable by people who don't speak the site's
native language. You have already developed functional dictionaries for a
bevy of different languages. What kind of tools does Lingro employ to tackle
this problem, both in pooling definitions and making these definitions
usable by both web-developers and site-visitors alike?*

There's a big gap in language learning between what's taught in a classroom
and what you need to know to use a language in day-to-day activities. Some
people will travel to a country where the language they're learning is
spoken in order to reach a sufficient level of fluency. Short of that, there
are surprisingly few ways for people to get support after they're done with
their coursework.

Lingro aims to fill this gap by giving everyone quick access to translations
while they're reading foreign-language web pages and documents, right at the
moment they need to know what a new word means. When you're using Lingro to
read a web page, you can click on any word in the text to bring up a
translation on the same page. This eliminates the need to move away from
what you're reading and go to a separate dictionary site, or thumb through a
paper dictionary.

When we started searching for content for Lingro, it was really important to
us from an ethical standpoint to use open dictionaries. Language is one of
the basic, common, and deeply necessary aspects of humanity, and to have the
core information about it controlled by a few large publishers seems wrong
on many levels. Especially in this age, where the growth of society is
driven by global interaction and cross-cultural communication, the means of
communicating across language barriers need to be as accessible as possible.

We set out to start assembling open dictionaries to include in Lingro, but
we found they were, well, a mess. A lot of people have done some really
great work building open dictionaries, but their efforts are scattered
across many different sites and projects. What's more, most of these
dictionaries aren't machine-readable. A big contributor to the success of
Creative Commons licenses has been making it easy to include
machine-readable data along with the work being licensed. This allows works
to be easily found through search engines by someone looking to reuse them.
Flickr is a great example of this - when you're searching for a particular
photo, you can specify your requirements for license terms, and the results
will show only those photos that fit your needs.

Imagine how hard it would be to do this kind of search if every Flickr user
had a different way of specifying the licenses of their works. This is
pretty much the way it is with the open dictionaries out there. The way a
dictionary for one language pair, say German to English, encodes information
such as translation text, part of speech, noun gender, etc. is usually
completely different from the way another language pair does it. This makes
it nearly impossible for a project like Lingro (which would eventually like
to support translating from every language to every other language) to
incorporate dictionaries from multiple sources.

To overcome this, we've been writing software that takes in this mish-mosh
of different dictionary formats and puts out dictionaries in a clear,
simple, machine-readable format. We then load them into Lingro's back-end so
that people can access all the dictionaries through a common interface.
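
As a rough illustration of that kind of conversion (a minimal sketch, not
Lingro's actual code), the Python below reads two made-up input formats and
emits the same record type for both. The Entry fields, the tab-separated
layout, and the "headword {gender} :: translation" layout are all
assumptions invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """One translation in a common shape (hypothetical schema)."""
    source_lang: str
    target_lang: str
    headword: str
    translation: str
    part_of_speech: Optional[str] = None
    gender: Optional[str] = None


def parse_tab_line(line: str, source_lang: str, target_lang: str) -> Entry:
    """Parse a made-up format: headword<TAB>translation<TAB>part of speech."""
    fields = line.rstrip("\n").split("\t")
    headword, translation = fields[0], fields[1]
    # The third column is optional in this hypothetical format.
    pos = fields[2] if len(fields) > 2 and fields[2] else None
    return Entry(source_lang, target_lang, headword, translation, pos)


def parse_braced_line(line: str, source_lang: str, target_lang: str) -> Entry:
    """Parse a made-up format: 'headword {gender} :: translation'."""
    left, translation = (part.strip() for part in line.split("::", 1))
    gender = None
    if left.endswith("}") and "{" in left:
        # Split off the trailing {tag} and keep the text before it.
        left, _, tag = left[:-1].rpartition("{")
        gender = tag.strip() or None
        left = left.strip()
    # Assumption for this sketch: only nouns carry a gender tag.
    pos = "noun" if gender else None
    return Entry(source_lang, target_lang, left, translation, pos, gender)


if __name__ == "__main__":
    print(parse_tab_line("gato\tcat\tnoun", "es", "en"))
    print(parse_braced_line("Haus {n} :: house", "de", "en"))
```

Whatever the input looked like, everything downstream only ever sees Entry
records, which is the point of converting to one common format.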

The process doesn't stop there. Once the dictionaries are loaded into
Lingro, we encourage users to contribute translations to continue expanding
the dictionaries. We've put a lot of effort into creating the Lingro
dictionary builder <http://lingro.com/builder/> which helps people easily
add translations and definitions. Once someone has chosen a language pair
they're fluent in, the builder shows them a list of words missing from that
particular dictionary, ordered by how common they are in the language (the
word "the" would be near the top, while "onomatopoeia" is further down).
They can also see sentences showing the words used in context to help recall
the meanings. These are the same kinds of tools used by the big publishers
to create their dictionaries - we're not just opening up the dictionaries
themselves, we're opening the entire process of creating them.
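
The frequency ordering described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the function
name, the use of a plain text corpus to estimate how common a word is, and
the simple letter-based tokenizer are all hypothetical and say nothing about
how the Lingro builder is actually implemented.

```python
import re
from collections import Counter
from typing import List, Set


def missing_words_by_frequency(corpus_text: str, known_words: Set[str]) -> List[str]:
    """Return words absent from the dictionary, most common first."""
    # Treat any run of letters as a word; lowercase so "The" and "the" merge.
    words = re.findall(r"[^\W\d_]+", corpus_text.lower())
    counts = Counter(words)
    missing = [word for word in counts if word not in known_words]
    # Sort by descending count, breaking ties alphabetically for stable output.
    return sorted(missing, key=lambda word: (-counts[word], word))


if __name__ == "__main__":
    corpus = "The cat saw the dog. The dog ran."
    print(missing_words_by_frequency(corpus, known_words={"cat"}))
```

With a larger corpus, very common words such as "the" would surface first
and rare ones such as "onomatopoeia" would sit far down the list.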

We've also created tools for webmasters of other sites that allow them to
directly access Lingro's dictionaries. Anyone can add Lingro's pop-up
translations to their pages, which is a really great way for sites
with a big international audience to make it easier on their readers. We've
also built a tiny search-as-you-type dictionary that webmasters can include
on their pages. These tools further Lingro's mission of making translations
as accessible as possible for as many people as we can.

*Part of Lingro's core is user submitted, CC BY-SA
licensed<http://creativecommons.org/licenses/by-sa/3.0/>,
word definitions. Why did you choose to go with CC licensing (and
specifically CC BY-SA)? Have you found CC licensing to be a good fit for
what Lingro is attempting to accomplish? How do CC-licensed definitions
compare to those pooled from other resources? Have there been any unique
instances or anecdotes you can think of that were enabled by CC-licensing?*

Just as the formatting of all the open dictionaries is a mess, the
licensing landscape is convoluted as well. Most of the dictionaries out
there were started before the creation of Creative Commons licenses. Some
dictionaries, like Wiktionary, use the GNU Free Documentation License (FDL),
while others, such as the XDXF project, use the GNU General Public License
(GPL). Even worse, some have no formal license at all, and we've had to get
in touch with some of the authors to ask permission to include their
dictionaries in Lingro.

When we were starting out, we were fortunate enough to get in touch with
Lawrence Lessig (founder of Creative Commons and generally recognized as the
foremost expert on cyberlaw) about the problem. He recommended that we
dual-license all the new user contributions under the CC BY-SA license and
the GNU FDL. This allows us to contribute our user translations back to
existing projects like Wiktionary, while also making them available under
the much easier to understand terms of a Creative Commons license.

We especially like the CC BY-SA because it ensures that the content created
on Lingro will be free forever. Anyone building on the work our contributors
have done will be able to freely share it with the community in the same
way. This freedom is central to the creation of a commons of knowledge and
allowing people to collaborate across cultures.

Since we made the decision to use the CC BY-SA, there have been some really
exciting developments in the world of content licenses. Back in December,
Wikimedia (the parent organization of Wikipedia and Wiktionary) announced
that the board had passed a resolution to work with the Free Software
Foundation on updating the GNU FDL (used by Wikimedia) to allow for
migration of their content to the Creative Commons BY-SA. This is an
important step because the FDL was never designed for projects like
Wiktionary; it was written with the intent of licensing software manuals.
The CC BY-SA, in contrast, was designed for projects just like Wikipedia
(and Lingro as well), which have a strong emphasis on collaboration. The
move to the CC BY-SA license means that people will have a much easier time
knowing their rights and restrictions when reusing Wikipedia content.

That said, I've been somewhat disappointed with the lack of progress since
then. There was a good deal of fanfare surrounding the original
announcement, but it's been more than half a year since then and we've
barely heard another word about it from the organizations involved. Great
things can happen when organizations which share such similar philosophies
come together for a common goal, so I hope Wikimedia, the Free Software
Foundation, and Creative Commons all continue their collaboration to make
this a reality!

*Lingro always seems to be adding new features and functions to its already
long list of amenities. What is next for Lingro? Is there anything else
you'd like our readers to know?*

We're working on some really cool study tools which will help people review
the words they've translated. On the educational side of things, one of the
unique aspects of Lingro is the ability to provide a personalized learning
experience, which is really necessary at the more advanced levels of
language learning. Lingro keeps track of the words you look up while reading
so that after you're done with the web page or document, you can use study
tools, such as the flashcard game and sentence history page, to help you
review these words. We've got some big improvements and additions to this
part of the site planned for the coming months!

Another part of the project we're developing very actively is the set of
tools for webmasters and language learning
sites<http://lingro.com/docs/webmaster-tools.html>.
We're creating easy ways to include Lingro's translations both as
full-featured pop-ups and as dictionary widgets. We're also making a
flexible dictionary API so that people can come up with new ways of reusing
the content. What we're shooting for is reaching a point where anyone
looking to include translation capabilities on their site can decide to use
open dictionaries not just because of ethical considerations, but because of
the high quality and ease of use of the dictionaries.

We're also working with volunteers to add more dictionaries to Lingro,
especially for widely spoken languages such as Chinese. There's a lot of
political tension between the West and China right now, and as with all
international disputes, one of the keys to resolution is communication. By
bringing together tools and dictionaries to help people communicate across
language and cultural boundaries, we think Lingro is able to make the world
a better place for all of its inhabitants.

If you would like to get involved with the project, please feel free to
e-mail me at *paul AT lingro DOT com*. We're always looking for more
volunteers to help expand the range and depth of the dictionaries available
through Lingro.



On Fri, Jul 4, 2008 at 5:39 PM, Günther Osswald <
[EMAIL PROTECTED]> wrote:

>
> Hi all,
> hi Leo, Seth
>
> Thanks for your feedback! The problem with volunteers is that they
> don't have them. My proposal to engage their own students for
> wikifying was rejected, because working in a wiki is not what they
> teach at this department. It is even new to the teachers themselves.
> I think what is needed is good software, that does the job, as all
> material is already on-line in HTML. As the MediaWiki wikis can read
> HTML, the question for us is whether we would be ready to accept 1000
> HTML-pages in WE. ( I remember I once put in one HTML-page, and Wayne
> finally wikified it for me, because he argued we should use wiki
> syntax, so that everyone would be able to deal with the source code
> and modify the text.)
> For translation, Prof. Wiesner would never be satisfied with a machine
> translation. The material is of high quality and should stay of high
> quality.
> Leo, the saying with the fish is absolutely right, but I think the LMU-
> people will only release their material, if they can convert it into a
> form that is ready for use, so to speak into a fish-can.
>
> Best regards,
>
> Günther
>
> http://www.wikieducator.org/User:White_Eagle
> >
>


-- 
--
Leigh Blackall
+64(0)21736539
skype - leigh_blackall
SL - Leroy Goalpost
http://learnonline.wordpress.com
