Re: Carbon Efficiency of Semantic Web and Linked Data Queries

Thought this might be of relevance to the discussion, re global data
infrastructures (from my LinkedIn feed):

https://www.digitalinformationworld.com/2019/06/the-world-s-most-creative-data-centers-infographic.html

On Tue, Jun 18, 2019 at 6:34 AM Marco Neumann <marco.neumann@gmail.com>
wrote:

> While we in the Semantic Web / Linked Data community don't seem to fall
> into the category of worst offenders in energy consumption (I am just
> looking at the forecast and data traffic breakdown on the internet [1] and
> the remarks made by the data-centre expert in Cheltenham [2] that digital
> sobriety with mobile camera phones could reduce data traffic in Europe by
> 40% immediately), current federated SPARQL queries seem to be less
> efficient than one would have hoped for 20 years ago [3]. You are probably
> doing more for your carbon footprint by turning off your monitor completely
> rather than leaving it in stand-by mode [4] than by optimizing your
> federated SPARQL queries or going the way of Solid Pods. It still seems
> difficult to estimate the number of deployed SPARQL solutions in industry
> and their footprint in terms of resource allocation. One of the best-known,
> though still heavily centralized, SPARQL services, the Wikidata Query
> Service (WDQS), has a rather modest footprint if you go by the numbers
> published recently [5].
>
> Still, and since this is my subject of interest here, the support for and
> implementation of federated SPARQL query solutions are surprisingly
> underdeveloped [3]. Looking forward to learning more about updates here
> from QuWeDa 2019 [6].
>
> [1]
> https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white-paper-c11-741490.html
> [2] https://www.nature.com/articles/d41586-018-06610-y
> [3] https://svn.aksw.org/papers/2017/FedEval-summary/public.pdf
> [4]
> https://www.energuide.be/en/questions-answers/how-much-power-does-a-computer-use-and-how-much-co2-does-that-represent/54/
> [5]
> https://wikitech.wikimedia.org/wiki/Wikidata_query_service/ScalingStrategy
> [6] https://sites.google.com/site/quweda2019/home
>
>
> On Mon, Jun 17, 2019 at 8:31 PM Zachary Whitley <zachary.whitley@gmail.com>
> wrote:
>
>> I wanted to add some perspective. The principal inputs of aluminum
>> refining are electricity and carbon; the process takes a significant
>> amount of electricity and produces large amounts of greenhouse gases.
>> Most of the electricity consumed is produced from coal. Yes, we should be
>> concerned about energy consumption for computing, but I wouldn't be
>> surprised if you would save more electricity and produce fewer greenhouse
>> gases by *expending* computing resources on making aluminum production
>> and recycling more efficient.
>>
>> [1] https://en.wikipedia.org/wiki/Aluminium_smelting
>> [2]
>> http://www.world-aluminium.org/statistics/primary-aluminium-smelting-power-consumption/#histogram
>>
>> On Mon, Jun 17, 2019 at 3:09 PM Steffen Staab <staab@uni-koblenz.de>
>> wrote:
>>
>>> I don’t believe that a case can be made for physically decentralized
>>> p2p being more energy efficient.
>>>
>>> 1. Compute centers can be placed where energy is cheap and cooling
>>> inexpensive.
>>> Indeed this has been done a lot.
>>>
>>> 2. Cooling reduces energy needs. Generated warmth could even be re-used.
>>> That is unthinkable for a DSL box.
>>>
>>> 3. Modern CPUs use less energy when unused. There is less need to put
>>> otherwise idle compute cycles in DSL boxes to use (although I guess
>>> these modern CPUs are still only in laptops so far).
>>>
>>> 4. Decentralized energy production is good. Globally, however, people
>>> increasingly live in cities. This is not where most energy is or will
>>> be produced (though it can become more than today).
>>>
>>> For sure, there is a lot of fruitful, middle ground between going for
>>> DSL boxes vs all using the same centralized compute center.
>>> I don’t believe in the extremely decentralized scenarios very much.
>>>
>>> Steffen
>>>
>>>
>>>
>>> On 17.06.2019 at 17:38, Henry Story <henry.story@bblfish.net> wrote:
>>>
>>>
>>>
>>> On 17 Jun 2019, at 01:14, Marco Neumann <marco.neumann@gmail.com> wrote:
>>>
>>> I would agree, Henry. I think p2p networks are provably more cost
>>> efficient than centralized services, in particular for small data
>>> providers. I think a case could now be made with regard to energy
>>> efficiency. Taking your example of underused resources, I would not be
>>> surprised to find big tech already taking advantage of this network
>>> infrastructure of underutilized nodes (aka your browser) rather than
>>> benefiting the individual end-users directly.
>>>
>>>
>>> Also a good point with regard to using local resources, similar to
>>> modern energy networks where most of the budget is consumed not by
>>> production but by transportation, storage and infrastructure.
>>>
>>> Is there work on p2p search for Solid pods underway? I need to look at
>>> HTTP/2 and Solid pods more closely, I guess. My pod on solid.community is
>>> currently not in good shape, and I don't really have the feeling of
>>> being in control of my own data. Is it more advisable to run my own Solid
>>> pod?
>>>
>>> https://neumann.solid.community/public/
>>>
>>>
>>> It depends on how much you want to involve yourself in these early
>>> stages.
>>>
>>> In 1993 I installed Linux on my father’s 40 MHz laptop to see how well it
>>> fared, but it required quite a lot of knowledge to do that. Now everybody
>>> runs Linux on their phone and calls it Android.
>>>
>>> At this point the cloud version would be less work to get going, I
>>> guess :-)
>>>
>>> I think of the web, when deployed on individual instances, as peer to
>>> peer, and with Solid it really is so, since, for example, authenticating
>>> to a server requires the Guard to become a client to fetch data from
>>> another server. Each node can be in one or the other role at different
>>> times, which is not to say that some nodes like browsers won’t specialize.
>>>
>>> P2P file sharing with duplication of content across nodes should really
>>> be named something else, more like distributed content sharing. Adding
>>> such features to Solid pods would be possible, but I think they are trying
>>> to restrict scope to keep focus. Adding it the right way, with RDF data to
>>> link to other copies on other pods, would be a nice research project.
>>> Perhaps the most important place to add that for Solid servers would be
>>> as distributed (encrypted) backups of one's pod on friends' pods.
>>>
>>> Henry
>>>
>>>
>>>
>>> On Sun, Jun 16, 2019 at 5:25 PM Henry Story <henry.story@bblfish.net>
>>> wrote:
>>>
>>>> My guess is that such studies have not been done, mostly because the
>>>> widespread deployment that would happen if Solid took off has not
>>>> happened yet.
>>>>
>>>> But there are some reasons one could be optimistic.
>>>>
>>>> 1. Everyone currently has a DSL box at home that is on but not doing
>>>> much for a lot of the day, so consuming energy for nothing. With Solid
>>>> Pods those boxes would instead be doing something useful, and could use
>>>> solar electricity produced locally. So you don’t increase local
>>>> electricity costs that much and you can use locally produced
>>>> electricity, but you do increase data consumption somewhat.
>>>>
>>>> 2. It is likely that most people communicate with local friends, and in
>>>> most cases don’t cross frontiers due to language barriers. This may not
>>>> be the case for the W3C community, but for the wider population it is a
>>>> lot more likely. So in a way Solid pods communicating with local friends
>>>> would use less energy, since packets would not need to be sent around
>>>> the world.
>>>>
>>>> 3. There are a lot of optimization strategies that become possible with
>>>> widely deployed pods. For example, as in p2p networks, copies of
>>>> data-heavy media can be fetched from the nearest cache.
>>>>
>>>> 4. With the internet of things growing, having the packets travel only
>>>> as far as required, ideally staying in the home rather than going to
>>>> large service providers, should also improve data costs as well as
>>>> privacy. That is, the role of a local DSL box turned into a data pod is
>>>> in any case going to grow in importance, so one may as well use this
>>>> growing infrastructure.
>>>>
>>>> Since producing energy locally is more efficient, and communicating
>>>> locally when that is needed is better, there are reasons to think that
>>>> some of the advantages of large providers may be offset in other ways.
>>>> That is without counting the huge improvements in communication
>>>> efficiency that come with HTTP/2, reactive frameworks, and CPU
>>>> efficiencies.
>>>>
>>>> Henry
>>>>
>>>> > On 16 Jun 2019, at 12:41, Marco Neumann <marco.neumann@gmail.com>
>>>> > wrote:
>>>> >
>>>> > Has anybody done work on Carbon Efficiency of Semantic Web and Linked
>>>> > Data Queries?
>>>> >
>>>> > The very nature of distributed data sets has to come with a
>>>> > substantial computational footprint every time a query is issued to a
>>>> > single node or a cluster of nodes for a federated query. On the other
>>>> > hand decentralization might actually outperform more centralized
>>>> > services in the future.
>>>> >
>>>> > I can find a number of papers and articles related to carbon
>>>> > efficiency in general computing and cloud computing environments and
>>>> > data centers but nothing specifically related to the improvement of
>>>> > operational efficiency introduced by Semantic Web and Linked Data
>>>> > infrastructures.
>>>> >
>>>> > There is CO2GLE which attempts to estimate the CO2 emissions per
>>>> > second released by web search engines like Google as a reference here:
>>>> >
>>>> > https://qz.com/1267709/every-google-search-results-in-co2-emissions-this-real-time-dataviz-shows-how-much/
>>>> >
>>>> >
>>>> > Regards,
>>>> > Marco
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> >
>>>> >
>>>> > ---
>>>> > Marco Neumann
>>>> > KONA
>>>> >
>>>>
>>>>
>>>
>>>
>
> --
>
>
> ---
> Marco Neumann
> KONA
>
> --
David McDonell Co-founder & CEO ICONICLOUD, Inc. "Illuminating the cloud"
M: 703-864-1203 EM: david@iconicloud.com URL: http://iconicloud.com

Received on Thursday, 20 June 2019 09:17:55 UTC