Re: Carbon Efficiency of Semantic Web and Linked Data Queries

While we in the Semantic Web / Linked Data community don't seem to be among
the worst offenders in energy consumption (I am just looking at the forecast
and traffic breakdown for the internet [1] and the remark made by the
data-centre expert in Cheltenham [2] that digital sobriety with mobile
camera phones could reduce data traffic in Europe by 40% immediately),
current federated SPARQL queries seem to be less efficient than one would
have hoped for 20 years ago [3]. You are probably doing more for your carbon
footprint by turning off your monitor completely rather than leaving it in
stand-by mode [4] than by optimizing your federated SPARQL queries or going
the way of Solid Pods. It still seems difficult to estimate the number of
SPARQL solutions deployed in industry and their footprint in terms of
resource allocation. One of the best-known, though still heavily
centralized, SPARQL services, the Wikidata Query Service (WDQS), has a
rather modest footprint if you go by the numbers published recently [5].
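
To make the stand-by comparison concrete, here is a back-of-the-envelope
sketch in Python. The function name and both input values (1 W of stand-by
draw, 0.5 kg CO2 per kWh of grid electricity) are placeholders I picked for
illustration, not figures taken from [4]; substitute measured values before
drawing any conclusion.

```python
HOURS_PER_YEAR = 24 * 365


def annual_co2_kg(power_watts: float, grid_kg_co2_per_kwh: float,
                  hours: float = HOURS_PER_YEAR) -> float:
    """Estimate yearly CO2 (kg) for a device drawing constant power."""
    energy_kwh = power_watts * hours / 1000.0  # watt-hours -> kilowatt-hours
    return energy_kwh * grid_kg_co2_per_kwh


# Hypothetical inputs, for illustration only (not measured data).
standby = annual_co2_kg(power_watts=1.0, grid_kg_co2_per_kwh=0.5)
print(f"{standby:.2f} kg CO2 per year")
```

The same function could be fed an estimate of the extra power a query
endpoint draws under load, which is exactly the number that seems hard to
come by for deployed SPARQL services.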

Still, and since this is my subject of interest here, support for and
implementation of federated SPARQL query solutions are surprisingly
underdeveloped [3]. Looking forward to learning more about updates here from
QuWeDa 2019 [6].
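
For readers who have not written one, a federated query is an ordinary
SPARQL 1.1 query in which one graph pattern is delegated to a remote
endpoint through the SERVICE keyword. The Python sketch below only builds
such a query string; the helper name, the endpoint URL and the
http://example.org/Thing class are hypothetical, and nothing is sent over
the network.

```python
def build_federated_query(remote_endpoint: str, limit: int = 10) -> str:
    """Return a SPARQL 1.1 query that delegates one pattern to a remote endpoint."""
    if limit < 1:
        raise ValueError("limit must be positive")
    return f"""PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?item ?label WHERE {{
  ?item a <http://example.org/Thing> .
  SERVICE <{remote_endpoint}> {{
    ?item rdfs:label ?label .
  }}
}}
LIMIT {limit}"""


# Hypothetical endpoint, for illustration only.
query = build_federated_query("https://example.org/sparql", limit=5)
print(query)
```

Every SERVICE block means at least one extra HTTP round trip to another
server, which is where both the performance cost and the energy cost of
federation come from.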

[1] https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white-paper-c11-741490.html
[2] https://www.nature.com/articles/d41586-018-06610-y
[3] https://svn.aksw.org/papers/2017/FedEval-summary/public.pdf
[4] https://www.energuide.be/en/questions-answers/how-much-power-does-a-computer-use-and-how-much-co2-does-that-represent/54/
[5] https://wikitech.wikimedia.org/wiki/Wikidata_query_service/ScalingStrategy
[6] https://sites.google.com/site/quweda2019/home


On Mon, Jun 17, 2019 at 8:31 PM Zachary Whitley <zachary.whitley@gmail.com>
wrote:

> I wanted to add some perspective. The principal components of aluminum
> refining are electricity and carbon; the process takes a significant amount
> of electricity and produces large amounts of greenhouse gases. Most of the
> electricity consumed is produced by coal. Yes, we should be concerned about
> energy consumption for computing, but I wouldn't be surprised if you would
> save more electricity and produce fewer greenhouse gases by *expending*
> computing resources on making aluminum production and recycling more
> efficient.
>
> [1] https://en.wikipedia.org/wiki/Aluminium_smelting
> [2]
> http://www.world-aluminium.org/statistics/primary-aluminium-smelting-power-consumption/#histogram
>
> On Mon, Jun 17, 2019 at 3:09 PM Steffen Staab <staab@uni-koblenz.de>
> wrote:
>
>> I don’t believe that a case can be made for physically decentralized p2p
>> being more energy efficient.
>>
>> 1. Compute centers can be placed where energy is cheap and cooling
>> inexpensive.
>> Indeed this has been done a lot.
>>
>> 2. Cooling reduces energy needs. Generated warmth could even be re-used.
>> Not thinkable for a DSL-box.
>>
>> 3. Modern CPUs use less energy when unused. There is less need to re-use
>> unnecessary compute cycles
>> in DSL boxes (well, I guess these modern CPUs are only in laptops so far
>> - still).
>>
>> 4. Decentralized energy production is good. Globally, however, people
>> increasingly live in cities. This is not where most
>> energy is or will be produced (though the share can become larger than today).
>>
>> For sure, there is a lot of fruitful middle ground between going for DSL
>> boxes and everyone using the same centralized compute center.
>> I don’t believe in the extremely decentralized scenarios very much.
>>
>> Steffen
>>
>>
>>
>> On 17.06.2019 at 17:38, Henry Story <henry.story@bblfish.net> wrote:
>>
>>
>>
>> On 17 Jun 2019, at 01:14, Marco Neumann <marco.neumann@gmail.com> wrote:
>>
>> I would agree, Henry. I think p2p networks are provably more cost
>> efficient than centralized services, in particular for small data providers.
>> I think a case could now be made with regard to energy efficiency.
>> Taking your example of underused resources, I would not be surprised to
>> find big tech already taking advantage of this network infrastructure of
>> underutilized nodes (aka your browser) rather than benefiting the
>> individual end-users directly.
>>
>>
>> Also a good point with regard to using local resources, similar to modern
>> energy networks, where most of the budget is consumed not by production
>> but by transportation, storage and infrastructure.
>>
>> Is there work on p2p search for Solid pods underway? I need to look at
>> HTTP/2 and Solid pods more closely, I guess. My pod on solid.community is
>> currently not in good shape and I don't really have the feeling of
>> being in control of my own data. Is it more advisable to run my own Solid
>> pod?
>>
>> https://neumann.solid.community/public/
>>
>>
>> It depends on how much you want to involve yourself in these early stages.
>>
>> In 1993 I installed Linux on my father’s 40 MHz laptop to see how well it
>> fared, but it required quite a lot of knowledge to do that. Now everybody
>> runs Linux on their phone and calls it Android.
>>
>> At this point the cloud version would be less work to get going, I guess
>> :-)
>>
>> I think of the web, when deployed on individual instances, as peer to peer,
>> and with Solid it really is so, since, for example, your authenticating to
>> a server requires the Guard to become a client to fetch data from another
>> server. Each node can be in one or the other role at different times, which
>> is not to say that some nodes like browsers won’t specialize.
>>
>> P2P file sharing with duplication of content across nodes should really be
>> named something else, more like distributed content sharing. Adding such
>> features to Solid pods would be possible, but I think they are trying to
>> restrict scope to keep focus. Adding it the right way, with RDF data to
>> link to other copies on other pods, would be a nice research project.
>> Perhaps the most important place to add that for Solid servers would be as
>> distributed (encrypted) backups of one's pod on friends' pods.
>>
>> Henry
>>
>>
>>
>> On Sun, Jun 16, 2019 at 5:25 PM Henry Story <henry.story@bblfish.net>
>> wrote:
>>
>>> My guess is that such studies have not been done, mostly because the
>>> widespread deployment that would happen if Solid caught on has not
>>> happened yet.
>>>
>>> But there are some reasons one could be optimistic.
>>>
>>> 1. Everyone currently has a DSL box at home that is on and not doing much
>>> for a lot of the day, so consuming energy for nothing. With Solid Pods
>>> those boxes would instead be doing something useful, and could use
>>> electricity from solar energy produced locally. So you don’t increase
>>> local electricity costs that much and you can use locally produced
>>> electricity, but you do increase data consumption somewhat.
>>>
>>> 2. It is likely that most people communicate with local friends, and in
>>> most cases don’t cross frontiers due to language barriers. This may not be
>>> the case for the W3C community, but for the wider population this is a
>>> lot more likely. So in a way Solid pods communicating with local friends
>>> would use less energy, since packets would not need to be sent around the
>>> world.
>>>
>>> 3. There are a lot of optimization strategies made possible by having
>>> widely deployed pods, for example fetching copies of data-heavy media
>>> from the nearest cache, as is done in p2p networks.
>>>
>>> 4. With the internet of things growing, having the packets stay in the
>>> home as far as possible rather than go to large service providers should
>>> also improve data costs as well as privacy. That is, the role of a local
>>> DSL box turned into a data pod is in any case going to grow in
>>> importance, so one may as well use this growing infrastructure.
>>>
>>> Since producing energy locally is more efficient, and communicating
>>> locally when that is needed is better, there are reasons to think that
>>> some of the advantages of large providers may be offset in other ways.
>>> That is without counting the huge improvements in communication
>>> efficiency that come with HTTP/2, reactive frameworks, and CPU
>>> efficiencies.
>>>
>>> Henry
>>>
>>> > On 16 Jun 2019, at 12:41, Marco Neumann <marco.neumann@gmail.com> wrote:
>>> >
>>> > Has anybody done work on Carbon Efficiency of Semantic Web and Linked
>>> > Data Queries?
>>> >
>>> > The very nature of distributed data sets has to come with a
>>> > substantial computational footprint every time a query is issued to a
>>> > single node or a cluster of nodes for a federated query. On the other hand
>>> > decentralization might actually outperform more centralized services in the
>>> > future.
>>> >
>>> > I can find a number of papers and articles related to carbon
>>> > efficiency in general computing and cloud computing environments and data
>>> > centers but nothing specifically related to the improvement of operational
>>> > efficiency introduced by Semantic Web and Linked Data infrastructures.
>>> >
>>> > There is CO2GLE which attempts to estimate the CO2 emissions per
>>> > second released by web search engines like Google as a reference here:
>>> >
>>> > https://qz.com/1267709/every-google-search-results-in-co2-emissions-this-real-time-dataviz-shows-how-much/
>>> >
>>> >
>>> > Regards,
>>> > Marco
>>> >
>>> >
>>> >
>>> > --
>>> >
>>> >
>>> > ---
>>> > Marco Neumann
>>> > KONA
>>> >
>>>
>>>
>>
>>

-- 


---
Marco Neumann
KONA

Received on Tuesday, 18 June 2019 10:28:56 UTC