All positive change is gradual. In the meantime, for those of us with ample free time for coding, it'd be nice to have a place to check in code and unit tests that are organized roughly in the same way as Wikipedia. Maybe such a project already exists and I just haven't found it yet.

> Date: Mon, 8 Jul 2013 23:01:50 +0300
> From: marty...@graphity.org
> To: wikidata-l@lists.wikimedia.org
> Subject: Re: [Wikidata-l] Accelerating software innovation with Wikidata and improved Wikicode
>
> Yes, that is one of the reasons functional languages are getting popular:
> https://www.fpcomplete.com/blog/2012/04/the-downfall-of-imperative-programming
> With PHP and JavaScript being the most widespread (and still misused) languages, we will not get there soon, however.
>
> On Mon, Jul 8, 2013 at 10:57 PM, Michael Hale <hale.michael...@live.com> wrote:
> > In the functional programming language family (think Lisp) there is no fundamental distinction between code and data.
> >
> >> Date: Mon, 8 Jul 2013 22:47:46 +0300
> >> From: marty...@graphity.org
> >> To: wikidata-l@lists.wikimedia.org
> >> Subject: Re: [Wikidata-l] Accelerating software innovation with Wikidata and improved Wikicode
> >>
> >> Here's my approach to software code problems: we need less of it, not more. We need to remove domain logic from source code and move it into data, which can be managed and on which UIs can be built. That way we can build generic, scalable software agents. That is the way to the Semantic Web.
> >>
> >> Martynas
> >> graphityhq.com
> >>
> >> On Mon, Jul 8, 2013 at 10:13 PM, Michael Hale <hale.michael...@live.com> wrote:
> >> > There are lots of code snippets scattered around the internet, but most of them can't be wired together in a simple flowchart manner.
> >> > If you look at object libraries that are designed specifically for that purpose, like Modelica, you can do all sorts of neat engineering tasks, like simulating the thermodynamics and power usage of a new refrigerator design. Then, if your company is designing a new insulation material, you would make a new "block" with the experimentally determined properties of your material to include in the programmatic flowchart, to quickly calibrate other aspects of the refrigerator's design. To my understanding, Modelica is as big and good as it gets for code libraries that represent physically accurate objects. Often, the visual representation of those objects needs to be handled separately. As far as general-purpose, standard programming libraries go, Mathematica is the best one I've found for quickly prototyping new functionality. A typical "web mashup" app or site will combine functionality and/or data from 3 to 6 APIs. Mobile apps will typically use the phone's built-in functionality, an extra library for better graphics support, a proprietary library or two made by the company, and a couple of web APIs. It's a similar story for desktop media-editing programs, business software, and high-end games, except the libraries are often larger. But there aren't many software libraries that I would describe as huge. And there are even fewer that manage to scale their usefulness in proportion to the size they occupy on disk.
> >> >
> >> > Platform fragmentation (the increase in the number and popularity of smartphones and tablets) has proven to be a tremendous challenge for continuing to improve libraries. I now just have 15 different ways to draw a circle on different screens.
> >> > The attempts to provide virtual machines with write-once, run-anywhere functionality (Java and .NET) have failed, often due to customer lock-in as much as platform fragmentation. Flash isn't designed to grow much beyond its current scope. The web standards can only progress as quickly as the least common denominator of functionality provided by other means, which is better than nothing, I suppose. Mathematica has continued to improve its library (that's essentially what they sell), but they don't try to cover a lot of platforms. They also aren't open source and don't attempt to make the entire encyclopedia interactive and programmable. Open-source attempts like the Boost C++ library don't seem to grow very quickly. But I think using Wikipedia articles as a scaffold for a massive open-source, object-oriented library might be what is needed.
> >> >
> >> > I have a few approaches I use to decide what code to write next. They can be arranged from most useful as an exercise to stay sharp in the long term to most immediately useful for a specific project. Sometimes I just write code in a vacuum. Like, I will choose a simple task, such as making a 2D ball bounce around some stairs interactively, and I will spend a few hours writing it and rewriting it to be more efficient and easier to expand. It always gives me a greater appreciation for the types of details that can be specified to a computer (and hence the scope of the computational universe, or space of all computer programs).
> >> > With the ball-bouncing example, you can get lost defining interesting options for the ball and the ground, or in the geometry logic for calculating the intersections (if the ball doesn't deform, or if the stairs have certain constraints on their shape, there are optimizations you can make). At the end of the exercise I still just have a ball bouncing down some stairs, but my mind feels like it has been on a journey. Sometimes I try to write code that I think a group of people would find useful. I will browse the articles in the computer science category by popularity and start writing the first things I see that aren't already in the libraries I use. So I'll expand Mathematica's FindClusters function to support density-based methods, or I'll expand the RandomSample function to support files that are too large to fit in memory, using a reservoir-sampling algorithm. Finally, I write code for specific projects. I'm trying to genetically engineer turf grass that doesn't need to be cut, so I need to automate some of the work I do for GenBank imports and sequence comparisons. For all of those, if there were an organized place to put my code afterwards, so it would fit into a larger useful library, I would totally be willing to do a little bit of gluing work to help fit it all together.
> >> >
> >> >> Date: Mon, 8 Jul 2013 19:13:54 +0200
> >> >> From: jane...@gmail.com
> >> >> To: wikidata-l@lists.wikimedia.org
> >> >> Subject: Re: [Wikidata-l] Accelerating software innovation with Wikidata and improved Wikicode
> >> >>
> >> >> I am all for a "dictionary of code snippets", but as with all dictionaries, you need a way to group them, either by alphabetical order or "birth date".
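[The reservoir-sampling extension Michael mentions above is a well-known streaming algorithm (Algorithm R). A minimal sketch in Python, since the actual Mathematica code was not posted; the file name in the usage comment is illustrative:]

```python
import random

def reservoir_sample(iterable, k, rng=random):
    """Return k items chosen uniformly at random from an iterable of
    unknown (possibly huge) length, holding only k items in memory."""
    reservoir = []
    for i, item in enumerate(iterable):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep item i with probability k / (i + 1), evicting a
            # uniformly chosen current resident.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

# Usage: sample 5 lines from a file too large to fit in memory.
# with open("huge_sequences.fasta") as f:
#     sample = reservoir_sample(f, 5)
```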
> >> >> It sounds like you have an idea of how to group those code samples, so why don't you share it? I would love to build my own "pipeline" from a series of algorithms that someone else published for me to reuse. I am also for more sharing of data-centric programs, but where would the data be stored? Wikidata is for data that can be used by Wikipedia, not by other projects, though maybe someday we will find the need to put actual weather measurements in Wikidata for some oddball Wikisource project to do with the history of global warming or something like that.
> >> >>
> >> >> I just don't quite see how your idea would translate, in the Wiki(p/m)edia world, into a project that could be indexed.
> >> >>
> >> >> But then I never felt the need for "high-fidelity simulations of virtual worlds" either.
> >> >>
> >> >> 2013/7/6, Michael Hale <hale.michael...@live.com>:
> >> >> > I have been pondering this for some time, and I would like some feedback. I figure there are many programmers on this list, but I think others might find it interesting as well.
> >> >> >
> >> >> > Are you satisfied with our progress in increasing software sophistication, as compared to, say, increasing the size of datacenters? Personally, I think there is still too much "reinventing the wheel" going on, and the best way to get to software that is complex enough to do things like high-fidelity simulations of virtual worlds is to essentially crowd-source the translation of Wikipedia into code. The existing structure of the Wikipedia articles would serve as a scaffold for a large, consistently designed, open-source software library.
> >> >> > Then, whether I was making software for weather prediction and needed code to slowly simulate physically accurate clouds, or making a game and needed code to quickly draw stylized clouds, I could just go to the article for clouds, click on C++ (or whatever programming language is appropriate), and find some useful chunks of code. Every article could link to useful algorithms, data structures, and interface designs that are relevant to the subject of the article. You could also find data-centric programs, like, maybe, a JavaScript weather-statistics browser and visualizer that accesses Wikidata. The big advantage would be that constraining the design of the library to the structure of Wikipedia would handle the encapsulation and modularity aspects of the software engineering, so that the components could improve independently. Creating a simulation or visualization where you zoom in from a whole cloud to see its constituent microscopic particles is certainly doable right now, but it would be a lot easier with a function library like this.
> >> >> >
> >> >> > If you look at the existing Wikicode and Rosetta Code, the code samples are small and isolated. They will show, for example, how to open a file in 10 different languages. However, the search engines already do a great job of helping us find those types of code samples across the blog posts of people who have had to do that specific task before.
> >> >> > However, a problem that I run into frequently, and that the search engines don't help me solve, is this: if I read a nanoelectronics paper and want to simulate the physical system it describes, I often have to go to the websites of several different professors and do a fair bit of manual work to assemble their different programs into a pipeline, and then the result of my hacking is not easy to extend to new scenarios. We've made enough progress on Wikipedia that I can often just click through a couple of articles to get an understanding of the paper, but if I want to experiment with the ideas in a software context I have to do a lot of scavenging and gluing.
> >> >> >
> >> >> > I'm not yet convinced that this could work. Maybe Wikipedia works so well because the internet reached a point where there was so much redundant knowledge listed in so many places that there was immense social and economic pressure to have knowledgeable people summarize it in a free encyclopedia. Maybe the total amount of software that has been written is still too small, there are still too few programmers, and programming is still too difficult, compared to writing natural languages, for the crowdsourcing dynamics to work. There have been a lot of successful open-source software projects, of course, but most of them are focused on creating software for a specific task rather than on library components that cover all of the knowledge in the encyclopedia.
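[The scaffold-and-pipeline idea in this proposal can be sketched as a thought experiment: a registry keyed by article title, where each entry is a function that can improve independently, and pipelines are built by chaining entries. Everything here is hypothetical illustration in Python, not a real API of any Wikimedia project; the titles and functions are made up:]

```python
from functools import reduce

# Hypothetical registry: article-style titles -> implementations.
# Each entry can be improved independently, like the article it mirrors.
registry = {}

def article(title):
    """Register a function under a Wikipedia-article-style title."""
    def wrap(fn):
        registry[title] = fn
        return fn
    return wrap

@article("Moving average")
def moving_average(xs, window=3):
    # Simple trailing-window smoothing.
    return [sum(xs[i:i + window]) / window
            for i in range(len(xs) - window + 1)]

@article("Normalization (statistics)")
def normalize(xs):
    # Rescale values to the unit interval.
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def pipeline(titles):
    """Chain registered components into one callable."""
    steps = [registry[t] for t in titles]
    return lambda data: reduce(lambda acc, f: f(acc), steps, data)

# Usage: compose two "articles" into the kind of reusable pipeline
# the thread is asking for.
smooth_then_scale = pipeline(["Moving average", "Normalization (statistics)"])
```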
_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l