- From: Bryan Bishop <kanzure@gmail.com>
- Date: Tue, 8 Jul 2008 21:00:52 -0500
- To: "W3C HCLSIG hcls" <public-semweb-lifesci@w3.org>
Hey all, I am new to the list, but I really should have known about this group years ago. I am running a semantic web project that is best summarized as apt-get for physical automation. Today, while writing some perl to steal the human cortex data from the Allen Institute, I stumbled upon the AJAX + SPARQL + RDF + Google Maps API + Ruby on Rails implementation that is mentioned in some slides/PDFs. Unfortunately, the hcls1 server on CSAIL at MIT seems to be dead -- I'd be willing to take over some of that code. I think it was here: http://hcls1.csail.mit.edu:8890/map/#Kcnip3@2850,Kcnd1@2800 and the PDF was: http://tinyurl.com/ysqm3z

Anyway, the basis of "SKDB", or the metadata for physical automation:
http://heybryan.org/exp.html
http://heybryan.org/new_exp.html
http://oscomak.net/

I am very happy to see gateways like http://hcls.deri.ie/hcls_demo.html (see the query sketch below): 'The following queries access a SPARQL endpoint hosted at DERI. The underlying triplestore contains over 325 million RDF triples of biomedical information. The information covers a large array of biomedical knowledge: from basic molecular biology over literature annotation up to anatomy and physiology.' Which I suspect is an integration of OBO Foundry, SBML, and other related projects.

I've noticed, however, that the major drawback of the majority of these semantic querying interfaces is that there's no plug-and-play functionality that I have found yet -- I hope I am completely wrong here -- but it's too bad that these databases have their tables and their data structures completely hidden instead of floating around as code. It's one of the reasons that I am a user of YAML and object serialization (see the PyYAML sketch below). http://yaml.org/ "YAML: YAML Ain't Markup Language / What It Is: YAML is a human friendly data serialization standard for all programming languages." And the python implementation (PyYAML): http://pyyaml.org/

I mention this for two strong reasons. First, has anyone seen PyLab? I regret that I keep mentioning python, just because it might show bias, but I actually do a significant amount of work not in python, so it's really just because I was going through the majority of known scientific number crunching packages: http://heybryan.org/num.html like Axiom, derive, macsyma, maple, mathematica, MATLAB, mupad, reduce, R, octave, sage, numpy, scipy, PDL, sympy, and next up on my list is CAD/CAM packages, which I suspect will have significant crossover with the CFD packages out there. Anyway, PyLab: http://scipy.org/PyLab "SciPy (pronounced "Sigh Pie") is open-source software for mathematics, science, and engineering. It is also the name of a very popular conference on scientific programming with Python. The SciPy library depends on NumPy, which provides convenient and fast N-dimensional array manipulation. The SciPy library is built to work with NumPy arrays, and provides many user-friendly and efficient numerical routines such as routines for numerical integration and optimization. Together, they run on all popular operating systems, are quick to install, and are free of charge. NumPy and SciPy are easy to use, but powerful enough to be depended upon by some of the world's leading scientists and engineers. If you need to manipulate numbers on a computer and display or publish the results, give SciPy a try!" And PyLab is the "total lab integration" mostly on the software side. But still, as I mentioned, I'm interested in extracting functionality out of the web (put it to work for you) instead of static information just sitting there.
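As a concrete example of the sort of plug-and-play access I'm after, here's a minimal sketch of hitting a SPARQL endpoint from Python with the SPARQLWrapper library. The endpoint URL is only a placeholder (I haven't confirmed what address the DERI demo actually exposes), and the query is just a generic "show me some triples" probe:

# Minimal sketch: pull a few triples back from a SPARQL endpoint.
# The endpoint URL is a placeholder, not a confirmed DERI address.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://hcls.deri.ie/sparql"  # placeholder

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery("""
    SELECT ?s ?p ?o
    WHERE { ?s ?p ?o }
    LIMIT 10
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["s"]["value"], row["p"]["value"], row["o"]["value"])

The point isn't this particular query; it's that the same dozen lines work against any endpoint that speaks the protocol, which is exactly the property I'd like to see for the data structures behind these databases too.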
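And on the YAML side, here's what I mean by metadata "floating around as code": a package description serialized with PyYAML and round-tripped back into a live object. The field names are invented for illustration, not an actual SKDB schema:

# Sketch: package metadata serialized with PyYAML and loaded back.
# Field names are invented for illustration, not an SKDB schema.
import yaml

package = {
    "name": "thermocycler",
    "version": "0.1",
    "depends": ["peltier-element", "pid-controller"],
    "source": "http://heybryan.org/exp.html",
}

text = yaml.dump(package, default_flow_style=False)
print(text)

restored = yaml.safe_load(text)
assert restored["depends"] == ["peltier-element", "pid-controller"]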
Somehow I ended up finding out about EXPO: http://expo.sf.net/ "EXPO defines over 200 concepts for creating semantic markup about scientific experiments, using the Web Ontology Language OWL. We propose the ontology EXPO to formalise generic knowledge about scientific experimental design, methodology, and results representation. Such a common ontology is both feasible and desirable because all the sciences follow the same experimental principles. The formal description of experiments for efficient analysis, annotation, and sharing of results is a fundamental objective of science."

An interesting example is "The Robot Scientist": http://www.aber.ac.uk/compsci/Research/bio/robotsci/ "The Robot Scientist is perhaps the first physical implementation of the task of Scientific Discovery in a microbiology laboratory. It represents the merging of increasingly automated and remotely controllable laboratory equipment and knowledge discovery techniques. Automation of laboratory equipment (the "Robot" of Robot Scientist) has revolutionised laboratory practice by removing the "drudgery" of constructing many wet lab experiments by hand, allowing an increase in both the scope and scale of potential experiments. Most lab robots only require a simple description of the various chemical/biological entities to be used in the experiments, along with their required volumes and where these entities are stored. Automation has also given rise to significantly increased productivity and a concomitant increase in the production of results and data requiring interpretation, giving rise to an "interpretation bottleneck" where the process of understanding the results is lagging behind the production of results."

So, how are these robots and automation machinery made? Usually in CAD programs. Admittedly, the open source CAD solutions are known to be not the best, but the point is still the same -- designers build and generate information, which is then implemented as physical machinery. Those designs are packages, and could be made accessible in an automatic manner. At the same time, the machines that this information is being fed into are already automated. So the cybernetic loop, as it were, is nearly complete. It's just that the focus on static information tends to ignore the instrumentation and automation hardware that brought in that information in the first place -- the programming and such.

So that's how apt-get is interesting (besides being awesome): http://en.wikipedia.org/wiki/Debian http://debian.org/ "Debian (pronounced [ˈdɛbiən]) is a computer operating system (OS) composed entirely of software which is both free and open source (FOSS). Its primary form, Debian GNU/Linux, is a popular and influential Linux distribution.[1] It is a multipurpose OS; it can be used as a desktop or server operating system. Debian is known for strict adherence to the Unix and free software philosophies.[2] Debian is also known for its abundance of options — the current release includes over twenty-six thousand software packages for eleven computer architectures.
These architectures range from the Intel/AMD 32-bit/64-bit architectures commonly found in personal computers to the ARM architecture commonly found in embedded systems and the IBM eServer zSeries mainframes.[3] Throughout Debian's lifetime, other distributions have taken it as a basis to develop their own, including: Ubuntu, MEPIS, Dreamlinux, Damn Small Linux, Xandros, Knoppix, Linspire, sidux, Kanotix, and LinEx among others.[4] A university's study concluded that Debian's 283 million source code lines would cost US$10 billion to develop by proprietary means.[5]" "Prominent features of Debian are its APT package management system, its strict policies regarding its packages and the quality of its releases.[6] These practices afford easy upgrades between releases and easy automated installation and removal of packages. Debian uses an open development and testing process. It is developed by volunteers from around the world and supported by donations through SPI, a non-profit umbrella organization for various free software projects.[7]"

In particular, apt-get allows users to retrieve packages by unique identifiers, with automatic installation and configuration to the local environment. This involves a significant amount of metadata, and lots of overhead for actually transmitting the packages, which has been known to redline the Cisco root nodes for weeks when Debian releases major updates. Heh. They are considering (or are they already implementing?) debtorrents and debtags to help bring that down to something less destructive.

SKDB/OSCOMAK is a pet project of mine and a handful of other programmers and machine shop enthusiasts interested in making sure that when gEDA and OpenCores happen everywhere else, the same infrastructure can be deployed in a functionally useful manner -- like Gershenfeld's group over at the MIT Media Lab, the 'FabLab' projects. Basically these are quantified shop configurations (much like Linux installations) for the physical floor space, with downloadable tools that would be implemented with whatever tools the system has wired up (obviously some things can't make other things). Behind all of this would be a design compiler, which works just like a regular compiler except that it resolves dependencies between metadata describing projects that are to be implemented, whether by hand or by machine (see the dependency-resolution sketch below).

I'm studying manufacturing engineering down at the University of Texas at Austin, along with some computational neuroscience. But I've realized that the bootstrapping requirements to make all of this happen are difficult -- even NIST had some troubles with their Virtual Manufacturing projects -- to the extent that it would require significant funding and a "leap of faith" from others, which might be somewhat unreasonable. So I've been putting most of the work into a specific implementation in biology: I put together the do-it-yourself biotechnology kit < http://heybryan.org/new_exp.html and http://biohack.sf.net/ and http://heybryan.org/biotech.git >. Genes are already highly unitized across the semantic web, and could be made to do interesting things; see http://partsregistry.org among others in synthetic biology.
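Here's a toy sketch of the dependency-resolution core of such a design compiler: given metadata records like the YAML above, produce a build order by topological sort. The package names and fields are invented for illustration; a real SKDB resolver would also have to check what tools the shop actually has wired up:

# Toy sketch: resolve dependencies between project metadata records
# into a build order (topological sort). Names are illustrative only.
def build_order(packages):
    """packages: dict mapping name -> {"depends": [other names]}"""
    order = []
    visited = set()

    def visit(name, stack=()):
        if name in visited:
            return
        if name in stack:
            raise ValueError("circular dependency at %s" % name)
        for dep in packages.get(name, {}).get("depends", []):
            visit(dep, stack + (name,))
        visited.add(name)
        order.append(name)

    for name in packages:
        visit(name)
    return order

packages = {
    "thermocycler": {"depends": ["peltier-element", "pid-controller"]},
    "peltier-element": {"depends": []},
    "pid-controller": {"depends": []},
}
print(build_order(packages))
# -> ['peltier-element', 'pid-controller', 'thermocycler']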
With some buddies I've been detailing the design strategies required to make a 'writozyme', a biologically replicable system that would allow individuals to very simply synthesize DNA without conventional DNA synthesizers like http://bioinformatics.org/pogo/ (which admittedly already works); the writozyme methodology would (hopefully) inherit the self-replication functionality. The metadata aspects are just the same as in OBO, SBML, the bioinformatics databases, and so on. And even more importantly, it's all 'functional' in that it's not "biobricks" that are being sent across the web, but instead the tools, machinery, and semantic snowball backing it all up and potentially turning into this recursive data acquisition process. Bacteria don't just sit there (unless you did your plate wrong, ugh). Some other interesting guys who are working with me on this: http://diybio.org/ http://openwetware.org/ http://biopunk.org/ and a few others that I am forgetting. I hope I have the right mailing list for talking about these topics :-).

The project that led me to the Allen Institute, and then to Science Commons, and now to W3C's HCLSIG group, was my attention to attention: http://heybryan.org/mediawiki/index.php/Sustained_attention -- specifically because of Henry Markram's combined work on computational neuroscience (microcolumn simulations of the brain in ~2005) and also, surprisingly, on autism: http://heybryan.org/intense_world_syndrome.html "Autism is a devastating neurodevelopmental disorder with a polygenetic predisposition that seems to be triggered by multiple environmental factors during embryonic and/or early postnatal life. While significant advances have been made in identifying the neuronal structures and cells affected, a unifying theory that could explain the manifold autistic symptoms has still not emerged. Based on recent synaptic, cellular, molecular, microcircuit, and behavioral results obtained with the valproic acid (VPA) rat model of autism, we propose here a unifying hypothesis where the core pathology of the autistic brain is hyper-reactivity and hyper-plasticity of local neuronal circuits. Such excessive neuronal processing in circumscribed circuits is suggested to lead to hyper-perception, hyper-attention, and hyper-memory, which may lie at the heart of most autistic symptoms. In this view, the autistic spectrum are disorders of hyper-functionality, which turns debilitating, as opposed to disorders of hypo-functionality, as is often assumed. We discuss how excessive neuronal processing may render the world painfully intense when the neocortex is affected and even aversive when the amygdala is affected, leading to social and environmental withdrawal. Excessive neuronal learning is also hypothesized to rapidly lock down the individual into a small repertoire of secure behavioral routines that are obsessively repeated. We further discuss the key autistic neuropathologies and several of the main theories of autism and re-interpret them in the light of the hypothesized Intense World Syndrome."

When combined with the human cortex datasets from the Allen Institute, things start to get very interesting :-). Throw in some metadata packaging dynamics, like from SKDB or apt-get, and suddenly you're programming simulations of neural slices (as we've done for many years now) -- or the actual physical tissue plates -- and you're able to engineer brains. Sort of :-).
http://heybryan.org/buildingbrains.html
http://heybryan.org/recursion.html

'At least' you're able to do some interesting science + neurofeedback, one of my intentions. So, that's the direction that I'm coming from. It looks like I completely missed Science Commons when it showed up on the map, and I deeply regret this. Are there any other initiatives that I should be made aware of?

I'm also approaching all of this from the aerospace angle: http://heybryan.org/2008-05-09.html It's an email I sent to some presenters at ISDC2008 (since I couldn't attend), the National Space Society, OpenVirgle ( http://google.com/virgle (humor is healthy)), and even some Google Lunar X Prize teams, like Interplanetary Ventures, and Team FREDNET, the open source team.

Cheers,
- Bryan
________________________________________
http://heybryan.org/