Re: Carbon Efficiency of Semantic Web and Linked Data Queries

According to this page, Bitcoin alone uses an estimated 67.3 TWh per year, more than Switzerland and less than the Czech Republic:
   https://digiconomist.net/bitcoin-energy-consumption

Trying to reach global consensus is expensive. 
Linked Data allows local consensus, which is much cheaper.


> On 19 Jun 2019, at 23:09, David McDonell <david@iconicloud.com> wrote:
> 
> And here’s a frothy commercial-sector industry report on data center concentration (including AWS) in Northern Virginia (DC metro area), citing megawatt consumption numbers drawn from standard grid sources:
> 
> https://www.datacenterknowledge.com/amazon/why-northern-virginia-data-center-market-bigger-most-realize
> 
> Point is, carbon efficiency has to address the backbone infrastructure dimension; edge/end-user profiles are feel-good but dwarfed in comparison.
> 
> On Wed, Jun 19, 2019 at 4:57 PM David McDonell <david@iconicloud.com> wrote:
> I think those latter three G-locations have abundant nuke power from the ‘local’ grid; whole different set of issues there;-)
> 
> On Wed, Jun 19, 2019 at 12:06 PM Marco Neumann <marco.neumann@gmail.com> wrote:
> I like the way Google is going almost carbon neutral here in Hamina, Finland, by using cold seawater to cool its systems. I hope they will also hook up the onsite sauna* to use excess HPC heat soon ;)
> 
> I am still surprised they continue to run supercomputer clusters in places like Texas (Frontera), Tennessee (Summit) and Livermore, CA (Sierra).
> 
> https://medium.com/arcticstartup-news/saunas-to-use-data-centres-excess-heat-c552e70946b
> 
> On Wed, Jun 19, 2019 at 2:17 PM David McDonell <david@iconicloud.com> wrote:
> Thought this might be of relevance to the discussion, re global data infrastructures (from my LinkedIn feed):
> 
> https://www.digitalinformationworld.com/2019/06/the-world-s-most-creative-data-centers-infographic.html
> 
> On Tue, Jun 18, 2019 at 6:34 AM Marco Neumann <marco.neumann@gmail.com> wrote:
> We in the Semantic Web / Linked Data community don't seem to be among the worst offenders in energy consumption. (I am just looking at the forecast and breakdown of internet data traffic [1] and at the remarks made by the data-centre expert in Cheltenham [2] that more sober use of mobile phone cameras could immediately reduce data traffic in Europe by 40%.) Still, current federated SPARQL queries seem to be less efficient than one would have hoped for 20 years ago [3]. You are probably doing more for your carbon footprint by turning off your monitor completely rather than leaving it in stand-by mode [4] than by optimizing your federated SPARQL queries or going the way of Solid Pods. It still seems difficult to estimate the number of SPARQL solutions deployed in industry and their footprint in terms of resource allocation. One of the best-known, though still heavily centralized, SPARQL services, the Wikidata Query Service (WDQS), has a rather modest footprint if you go by the recently published numbers [5].
> 
> Still, and since this is my subject of interest here, support for and implementation of federated SPARQL query solutions is surprisingly underdeveloped [3]. Looking forward to learning more about updates here from QuWeDa 2019 [6].
> 
> [1] https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white-paper-c11-741490.html
> [2] https://www.nature.com/articles/d41586-018-06610-y
> [3] https://svn.aksw.org/papers/2017/FedEval-summary/public.pdf
> [4] https://www.energuide.be/en/questions-answers/how-much-power-does-a-computer-use-and-how-much-co2-does-that-represent/54/
> [5] https://wikitech.wikimedia.org/wiki/Wikidata_query_service/ScalingStrategy
> [6] https://sites.google.com/site/quweda2019/home
> 
> 
> On Mon, Jun 17, 2019 at 8:31 PM Zachary Whitley <zachary.whitley@gmail.com> wrote:
> I wanted to add some perspective. The principal inputs to aluminum refining are electricity and carbon; it takes a significant amount of electricity and produces large amounts of greenhouse gases. Most of the electricity consumed is produced from coal. Yes, we should be concerned about energy consumption for computing, but I wouldn't be surprised if you would save more electricity and produce fewer greenhouse gases by *expending* computing resources on making aluminum production and recycling more efficient.
> 
> [1] https://en.wikipedia.org/wiki/Aluminium_smelting
> [2] http://www.world-aluminium.org/statistics/primary-aluminium-smelting-power-consumption/#histogram
> 
> On Mon, Jun 17, 2019 at 3:09 PM Steffen Staab <staab@uni-koblenz.de> wrote:
> I don’t believe that a case can be made for physically decentralized p2p being more energy efficient.
> 
> 1. Compute centers can be placed where energy is cheap and cooling inexpensive.
> Indeed this has been done a lot. 
> 
> 2. Cooling reduces energy needs, and the generated heat could even be re-used. This is unthinkable for a DSL box.
> 
> 3. Modern CPUs use less energy when unused. There is less need to re-use unnecessary compute cycles
> in DSL boxes (well, I guess these modern CPUs are only in laptops so far - still).
> 
> 4. Decentralized energy production is good. Globally, however, people increasingly live in cities. This is not where most
> energy is or will be produced (though it can become more than today).
> 
> For sure, there is a lot of fruitful, middle ground between going for DSL boxes vs all using the same centralized compute center.
> I don’t believe in the extremely decentralized scenarios very much.
> 
> Steffen
> 
> 
> 
>> On 17.06.2019 at 17:38, Henry Story <henry.story@bblfish.net> wrote:
>> 
>> 
>> 
>>> On 17 Jun 2019, at 01:14, Marco Neumann <marco.neumann@gmail.com> wrote:
>>> 
>>> I would agree, Henry. I think p2p networks are provably more cost-efficient than centralized services, in particular for small data providers. I think a case could now be made with regard to energy efficiency. Taking your example of underused resources, I would not be surprised to find big tech already taking advantage of this network infrastructure of underutilized nodes (aka your browser) rather than benefiting the individual end-users directly.
>>> 
>>> Also a good point with regard to using local resources, similar to modern energy networks, where most of the budget is consumed not by production but by transportation, storage and infrastructure.
>>> 
>>> Is there work on p2p search for Solid pods underway? I need to look at HTTP/2 and Solid pods more closely, I guess. My pod on solid.community is currently not in good shape, and I don't really have the feeling of being in control of my own data. Is it more advisable to run my own Solid pod?
>>> 
>>> https://neumann.solid.community/public/
>> 
>> It depends on how much you want to involve yourself in these early stages.
>> 
>> In 1993 I installed Linux on my father’s 40 MHz laptop to see how well it fared,
>> but it required quite a lot of knowledge to do that. Now everybody runs Linux
>> on their phone and calls it Android. 
>> 
>> At this point the cloud version would be less work to get going I guess :-)
>> 
>> I think of the web, when deployed on individual instances, as peer to peer,
>> and with Solid it really is so, since, for example, authenticating to a server
>> requires the Guard to become a client to fetch data from another server.
>> Each node can be in one or the other role at different times - which is not
>> to say that some nodes like browsers won’t specialize.
>> 
>> P2P file sharing with duplication of content across nodes should really be
>> named something else, more like distributed content sharing. Adding such features
>> to Solid pods would be possible, but I think they are trying to restrict scope to keep focus.
>> Adding it the right way - with RDF data to link to other copies on other pods - would
>> be a nice research project. Perhaps the most important place to add that for
>> Solid servers would be as distributed (encrypted) backups of one's pod on friends' pods.
>> 
>> Henry
>> 
>>> 
>>> 
>>> On Sun, Jun 16, 2019 at 5:25 PM Henry Story <henry.story@bblfish.net> wrote:
>>> My guess is that such studies have not been done, mostly because widespread
>>> deployment as would happen if Solid became widespread has not happened
>>> yet.
>>> 
>>> But there are some reasons one could be optimistic.
>>> 
>>> 1. Everyone currently has a DSL box at home that is on but not doing much
>>> for a lot of the day, so consuming energy for nothing. With Solid Pods
>>> those would instead be doing something useful, and could use electricity from solar
>>> energy produced locally. So you don’t increase local electricity costs
>>> that much, you can use locally produced electricity, but you increase some
>>> consumption of data.
>>> 
>>> 2. It is likely that most people communicate with local friends, and in
>>> most cases don’t cross frontiers due to language barriers. This may not be
>>> the case for the W3C community, but for the wider population this is a
>>> lot more likely. So in a way Solid pods communicating with local friends
>>> would use less energy, since packets would not need to be sent around the
>>> world.
>>> 
>>> 3. There are a lot of optimization strategies that become possible with
>>> widely deployed pods. For example, as in p2p networks, copies of
>>> data-heavy media can be fetched from the nearest cache.
>>> 
>>> 4. With the internet of things growing, having the packets stay as much as
>>> required in the home rather than go to large service providers should
>>> also improve data costs as well as privacy. That is, the role of a local DSL
>>> box turned into a data pod is in any case going to grow in importance, so
>>> one may as well use this growing infrastructure.
>>> 
>>> Since producing energy locally is more efficient, and communicating locally
>>> when that is needed is better, there are reasons to think that some of 
>>> the advantages of large providers may be offset in other ways. That is
>>> without counting the huge improvements in efficiency in communication
>>> that come with HTTP/2, reactive frameworks, and CPU efficiencies.
>>> 
>>> Henry
>>> 
>>> > On 16 Jun 2019, at 12:41, Marco Neumann <marco.neumann@gmail.com> wrote:
>>> > 
>>> > Has anybody done work on Carbon Efficiency of Semantic Web and Linked Data Queries?
>>> > 
>>> > The very nature of distributed data sets has to come with a substantial computational footprint every time a query is issued to a single node or a cluster of nodes for a federated query. On the other hand decentralization might actually outperform more centralized services in the future. 
>>> > 
>>> > I can find a number of papers and articles related to carbon efficiency in general computing and cloud computing environments and data centers but nothing specifically related to the improvement of operational efficiency introduced by Semantic Web and Linked Data infrastructures.
>>> > 
>>> > There is CO2GLE which attempts to estimate the CO2 emissions per second released by web search engines like Google as a reference here:
>>> > 
>>> > https://qz.com/1267709/every-google-search-results-in-co2-emissions-this-real-time-dataviz-shows-how-much/
>>> > 
>>> > 
>>> > Regards,
>>> > Marco
>>> > 
>>> > 
>>> > 
>>> > -- 
>>> > 
>>> > 
>>> > ---
>>> > Marco Neumann
>>> > KONA
>>> > 
>>> 
>> 
> 
> -- 
> David McDonell Co-founder & CEO ICONICLOUD, Inc. "Illuminating the cloud" M: 703-864-1203 EM: david@iconicloud.com URL: http://iconicloud.com

Received on Wednesday, 19 June 2019 23:55:13 UTC