I think software and most other engineering products need a much higher level of coherence than artefacts that are consumed by humans, like Wikipedia. Wikipedia is full of inconsistencies and even contradictions. When I browse Wikipedia, I often stumble upon statements on one page that are contradicted by another page. That's not a big deal: sometimes I can easily tell which statement is correct (and fix the other pages), sometimes I can't, but either way, I am not a computer and I don't follow these statements blindly. In a computer program, such inconsistencies would lead to erratic behavior, i.e., bugs. This means that a completely open wiki process will not work for software.
In a way, a lot of open source software is developed in a restricted wiki way: someone proposes a change, but before it is merged, it is checked by people who (hopefully) know all the nooks and crannies of the existing code. A bit like edit-protected pages on Wikipedia: everyone can propose changes on the talk page, but only admins can actually make them.

Christopher

On 7 July 2013 04:16, Michael Hale <[email protected]> wrote:
> I'm glad you mentioned that the same issue applies to electronics. I suppose I could have just referred to Moore's law instead of the relatively recent growth in the size of datacenters. I like asking computers to work hard, but I find it hard to think of valuable things for them to do. You can play a new game or donate time to BOINC, but not many great games are produced each year, and BOINC typically runs algorithms that benefit humanity but not you specifically. For example, my genetics tests say I have an increased risk of prostate cancer, so I'd like to be able to tell Folding@home to focus on the proteins that are most relevant to the diseases I'm most likely to get.
>
> I still have hope that a more wiki-like model could work for developing software libraries, though. The problems of technical design in software and hardware are similar, but software can be developed more fluidly and rapidly because of the lower barrier to entry and the non-existent manufacturing costs. Essentially all electronics are designed and simulated in software before physical prototypes are constructed these days.
>
> I've thought about the integration problem some, but I haven't ironed out how it would all work yet. I think standard object-oriented programming and modeling techniques have been absorbed by enough programmers that it might be worth a shot, though. Essentially, each article would have a standard class and supporting data structures or file formats for the inputs and outputs of its algorithms. It would be like the typical flowchart or visual programming environments you can use with languages like Modelica, but on a larger scale, and the formats would often be more complex. For example, you would have a class representing a cloud, with flags for different representations (density values on a cubic grid, collections of point particles, polygonal shape approximations, etc.) that are used by different algorithms. You would then have code to convert between all of the representations, code for generating random clouds (with potentially many optional parameters to specify atmospheric conditions), code for outputting images of the generated clouds in different styles, and algorithms for manipulating them through time. Then, if I wanted to see the effects of different atmospheric conditions on a specific cloud I've made drifting over the ocean, I could grab the code to instantiate 3D Euclidean space with a virtual camera, add some gravity, some ground, some water, an atmosphere, and my cloud, and then simulate it with adjustable parameters for the accuracy and speed of the computation. That leaves out a lot of details, but I don't know of another way to easily mix capabilities from high-end graphics software and various specialized simulation algorithms in lots of ways. Graphics software typically gives you some simulation capabilities, and simulation software typically gives you some graphics functionality, but I want lots of both.
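A minimal sketch of the multi-representation cloud class described above, assuming C++ as the host language. Every name here (Cloud, DensityGrid, PointParticles, the humidity parameter) is invented for illustration, not taken from any real library:

    // Hypothetical "article class" for a cloud: one object, several
    // interchangeable representations, plus a random generator.
    #include <array>
    #include <cstddef>
    #include <random>
    #include <vector>

    // Density values sampled on a cubic grid (n x n x n, row-major).
    struct DensityGrid {
        std::size_t n;
        std::vector<double> values;
    };

    // The same cloud as a collection of point particles in [-1, 1]^3.
    struct PointParticles {
        std::vector<std::array<double, 3>> positions;
    };

    class Cloud {
    public:
        // Random generator; 'humidity' stands in for the many optional
        // atmospheric parameters the post imagines.
        static Cloud random(double humidity, unsigned seed) {
            Cloud c;
            std::mt19937 rng(seed);
            std::uniform_real_distribution<double> coord(-1.0, 1.0);
            const std::size_t count =
                static_cast<std::size_t>(1000.0 * humidity);
            c.particles_.positions.reserve(count);
            for (std::size_t i = 0; i < count; ++i)
                c.particles_.positions.push_back(
                    {coord(rng), coord(rng), coord(rng)});
            return c;
        }

        const PointParticles& as_particles() const { return particles_; }

        // Conversion between representations: bin particles into a grid.
        DensityGrid as_density_grid(std::size_t n) const {
            DensityGrid g{n, std::vector<double>(n * n * n, 0.0)};
            auto bin = [n](double x) {
                std::size_t i =
                    static_cast<std::size_t>((x + 1.0) / 2.0 * n);
                return i < n ? i : n - 1;  // clamp the upper edge
            };
            for (const auto& p : particles_.positions)
                g.values[(bin(p[0]) * n + bin(p[1])) * n + bin(p[2])] += 1.0;
            return g;
        }

    private:
        PointParticles particles_;
    };

A weather model could then consume the density grid while a game renderer works from the particle view; adding a polygonal representation would mean one more struct and one more conversion, without touching existing callers.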
> I think having more semantic annotation tools will be great, but I don't spend most of my time doing searches. There is an astounding amount of information, data, and media on the internet, but it's not hard to find the edge if you really try. It's pretty crazy how many results come up if you search for images of "blue bear", but if you search for "blue bear and green gorilla" you don't get anything useful. Then you get to face the craziness of how many options you have for combining a picture of a blue bear and a different picture of a green gorilla into one picture. I think what they are trying with the Wolfram Alpha website is interesting, but they will always have to limit the time of the computations they allow you to run on their servers, which is why I think we need better libraries to more easily program the computers we have direct control over.
>
> ________________________________
> From: [email protected]
> Date: Sat, 6 Jul 2013 17:49:41 -0400
> To: [email protected]
> Subject: Re: [Wikidata-l] Accelerating software innovation with Wikidata and improved Wikicode
>
> Thanks for sharing your thoughts, Michael. This has also been bothering me for a while, and not only in programming but also in other technical domains like electronics.
>
> In my opinion, the reason why programming (or technical design in general) couldn't follow the wiki world is that it has some structural differences that require a different approach. To start with, there is the problem of integration: code solutions are usually part of a larger system, and they cannot be isolated or combined with other blocks as easily as you would combine text fragments in Wikipedia. I'm sure that all those 10 open-file examples have some particularities regarding the operating system, method, supporting libraries, etc. The scavenging and gluing part will always be there unless you follow the approach used in hardware design (wp: semiconductor intellectual property core).
>
> Since that kind of modularity is hard to establish at a large scale beyond what already exists, it would be more practical to focus on what can be improved more easily, which is the scavenging. Instead of copying code fragments, it would be better to point to the fragment in the source code project itself, while at the same time providing the semantic tags necessary to describe that fragment. This can be done (more or less) with currently existing semantic annotation technology (see thepund.it and DBpedia Spotlight).
>
> If this has not been done before, it is maybe because semantic tools are still in the transition from "adaptation of an emerging technology" to "social appropriation of that technology". It took six years for the wiki concept to be transformed into Wikipedia, and more or less the same number of years passed between SMW and Wikidata. Semantic annotation of code will eventually happen; how fast will depend on the interest in such a tool and on the success of the supporting technologies.
>
> Micru
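A minimal sketch of the kind of annotation record this proposal points toward: instead of copying a fragment, it stores a stable pointer into the source project plus semantic tags. All names and values (CodeFragmentAnnotation, the repository URL, the line range) are hypothetical, and the entity URI is assumed to be the Wikidata item for "cloud":

    // Hypothetical annotation: point at a fragment of an existing project
    // rather than copying it, and tag it with concept URIs.
    #include <cstddef>
    #include <string>
    #include <vector>

    struct CodeFragmentAnnotation {
        std::string repository;   // e.g. a git URL
        std::string revision;     // commit hash, so the pointer stays stable
        std::string file_path;    // path inside the repository
        std::size_t first_line;   // fragment boundaries, 1-based, inclusive
        std::size_t last_line;
        std::vector<std::string> concepts;  // e.g. Wikidata entity URIs
    };

    int main() {
        // Illustrative values only; the repository and file are made up,
        // and Q8074 is assumed to be the Wikidata item for "cloud".
        CodeFragmentAnnotation cloud_sim{
            "https://example.org/atmo-sim.git",
            "3f2c1ab",
            "src/cloud_model.cpp",
            120, 184,
            {"http://www.wikidata.org/entity/Q8074"}};
        (void)cloud_sim;  // silence unused-variable warnings in the sketch
        return 0;
    }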
> On Sat, Jul 6, 2013 at 3:10 PM, Michael Hale <[email protected]> wrote:
>
> I have been pondering this for some time, and I would like some feedback. I figure there are many programmers on this list, but I think others might find it interesting as well.
>
> Are you satisfied with our progress in increasing software sophistication as compared to, say, increasing the size of datacenters? Personally, I think there is still too much "reinventing the wheel" going on, and the best way to get to software that is complex enough to do things like high-fidelity simulations of virtual worlds is to essentially crowd-source the translation of Wikipedia into code. The existing structure of Wikipedia articles would serve as a scaffold for a large, consistently designed, open-source software library. Then, whether I was making weather-prediction software and needed code to slowly simulate physically accurate clouds, or making a game and needed code to quickly draw stylized clouds, I could just go to the article on clouds, click on C++ (or whatever programming language is appropriate), and find some useful chunks of code. Every article could link to useful algorithms, data structures, and interface designs that are relevant to the subject of the article. You could find data-centric programs too, like a JavaScript weather statistics browser and visualizer that accesses Wikidata. The big advantage would be that constraining the design of the library to the structure of Wikipedia would handle the encapsulation and modularity aspects of the software engineering, so that the components could improve independently. Creating a simulation or visualization where you zoom in from a whole cloud to see its constituent microscopic particles is certainly doable right now, but it would be a lot easier with a function library like this.
>
> If you look at the existing Wikicode and Rosetta Code, the code samples are small and isolated. They will show, for example, how to open a file in 10 different languages. But search engines already do a great job of helping us find those types of code samples across the blog posts of people who have had to do that specific task before. A problem I run into frequently that search engines don't help me solve is that when I read a nanoelectronics paper and want to simulate the physical system it describes, I often have to go to the websites of several different professors and do a fair bit of manual work to assemble their different programs into a pipeline, and the result of my hacking is not easy to extend to new scenarios. We've made enough progress on Wikipedia that I can often just click on a couple of articles to get an understanding of the paper, but if I want to experiment with the ideas in a software context I have to do a lot of scavenging and gluing.
>
> I'm not yet convinced that this could work. Maybe Wikipedia works so well because the internet reached a point where there was so much redundant knowledge listed in so many places that there was immense social and economic pressure to get knowledgeable people to summarize it in a free encyclopedia. Maybe the total amount of software that has been written is still too small, there are still too few programmers, and programming is still too difficult compared to writing natural language for the crowdsourcing dynamics to work. There have been a lot of successful open-source software projects, of course, but most of them focus on creating software for a specific task rather than on library components that cover all of the knowledge in the encyclopedia.
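A minimal sketch of how the article-as-scaffold idea might look in code, with one namespace per article exposing both the slow/accurate and the fast/stylized entry points mentioned above; wikilib and both function names are invented for this sketch:

    // Hypothetical layout: the "cloud" article maps to one namespace whose
    // entry points cover both ends of the accuracy/speed trade-off.
    #include <cstddef>
    #include <vector>

    namespace wikilib { namespace cloud {

    struct Image {
        int width;
        int height;
        std::vector<float> pixels;  // grayscale, row-major
    };

    struct CloudState {
        std::vector<float> density;  // placeholder representation
    };

    // Weather prediction: physically accurate but slow (stub body).
    inline CloudState simulate_physical(const CloudState& initial,
                                        double /*seconds*/) {
        return initial;  // a real version would integrate fluid dynamics
    }

    // Games: stylized but fast (stub body returning a blank frame).
    inline Image draw_stylized(const CloudState& /*state*/,
                               int width, int height) {
        return Image{width, height,
                     std::vector<float>(
                         static_cast<std::size_t>(width) * height, 1.0f)};
    }

    }}  // namespace wikilib::cloud

Both the weather model and the game would then link against the same article's namespace, which is where the encapsulation-by-article constraint would do its work.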
