22 May 2023
WOLFGANG TREMMEL: Welcome to the second Plenary of today. I am Wolfgang and I am going to chair this Plenary session, together with my colleague, Moin here, we are both from the Programme Committee and a reminder if you want to join us in the Programme Committee there is an election. So if you would like to put your name forward, send an e‑mail to pc [at] ripe [dot] net, and perhaps you can chair a session next RIPE.
So, the first speaker today is Karin Ahl from Netnod.
She is going to talk about how Sweden built a world‑leading time network.
KARIN AHL: Thank you. Very happy to be here, it's my first conference, I hope to come and visit more of them, I had a good first impression. My name is Karin Ahl, I am the interim CEO of Netnod since a couple of weeks now. I am here in another capacity and that is that I am also in charge of the technology area for time and frequency at the company.
And I'm going to talk a little bit about how we in Sweden built a world‑leading time distribution infrastructure.
I am going to start by looking at the reasons and the requirements we had for a national time distribution network, and I'm going to outline the infrastructure we use to ensure the security, accuracy and redundancy of the services. We work a lot with time as a service. We are going to look into the services we deliver, time has both for‑profit and not‑for‑profit services, and we're going to focus on NTS, which is a very nice addition to the NTP that most of you probably know. And the main players throughout the presentation are not only us at Netnod, but also the telecom authority, PTS, and RISE, which is the research institute in Sweden formerly known as SP.
For those of you who don't know much about Netnod. Just short. We are an organisation with more than 25 years working at the very core of the Internet.
We operate the largest Internet Exchange in the Nordics. And we also manage i.root, one of the world's 13 DNS root name servers. We also provide DNS services to enterprises and some of the largest top level domains in the world.
And also, of course, my area, we develop time services with the highest levels of accuracy and security.
And that's also our youngest child in the family.
We are a neutral organisation, formed and fully owned by the not‑for‑profit foundation Stiftelsen för Telematikens Utveckling (TU‑Stiftelsen). We focus on contributing to the development of the Internet from three perspectives: we provide rock solid, best in class services at the very core of the Internet; we ensure full redundancy and the highest level of service availability in all three business areas that we run; and we play a role in standards and governance activities and policy discussions, not only for ourselves and the services we deliver, but also for customers who are interested in them and might not be able to do it themselves.
And all these three targets are combined in time and frequency.
So, let's look a little bit at the background for time and frequency in Sweden. It started already back in 1997, when we started to deliver the first NTP service based on clocks, but it wasn't until 2014 that we got the big project at Netnod to start to develop a national distribution network. Between those years, there were lots of things going on. There were pre‑studies and discussions, needs were growing from the operators, and a lot of analysis was done by the government, by the operators and also by the telecom authority.
And some of the conclusions were that citizens and critical community services were in need and dependent on the availability of electronic communications. And the electronic communication as such had a big dependency on correct time and frequency. And time and frequency distributed by GNSS was also very easily spoofed or interrupted, that was also part of the analysis, so they wanted to do something different.
And given all those factors, there was a clearly identified public need, but it was also very clear that the market was not interested in deploying this kind of distribution network at the time.
Just a short sidetrack. GNSS, we talk a lot about it, we use it, it's a great technology when it works. But I'm sure that many of you also know that it's easily interrupted and compromised. I'm sure you heard of examples from your home countries, the border lines, the customs are talking about it, airports and energy sectors, everybody, but when it works, it works and it's good.
So the national time distribution system. A system without GNSS, from a national perspective, could guarantee a more robust and secure distribution network. That was a given result from the studies. And the system had to be robust and available throughout the country. That was also one of the requirements. The services delivered from the system must also be on a price level so that all the operators could actually use the services. So the price was not ‑‑ it couldn't be a barrier for use. Then it would have been a failure.
And as we're financed by public funding, the network had to be placed and operated within Sweden, and the government needs to have some insight into what we do and be able to make an audit or anything like that. And the funding: we have very robust funding and financing. As I said at the beginning, it's not only us at Netnod that are involved in this, we also have PTS, the authority, and RISE, who is monitoring the whole system. And it's really a tight cooperation. It's a success factor that we use a lot and talk a lot about. The public/private cooperation is a form that's not given all the time, but in this case it really works, and we are funded from the financing for electronic communication, and that is a fund put together from 1 ppm from the yearly turnover of the operators, and it's matched with as much funding from the government. And from that fund they spread it over, for example, redundancy, new fiber paths, emergency power, backups, sync in time, which is us, and, on a yearly basis, we can work on developing the infrastructure, doing better and smarter things for the network, moving closer out in the country and establishing more nodes.
We have a long‑term view on what we do with the public financing and the public planning. It's not on a yearly basis. It's not a market economy in this case.
So, let's move over to the infrastructure and have a look at it.
You could see the map before. On all these sites we have dual nodes. As you can see, they are duplicated on each side. All the critical equipment is actually doubled, so you have redundancy. For example, you can see at the bottom you have white boxes with the red digits on them, those are the caesium clocks, and all the parts above them are actually off‑the‑shelf products from different vendors, which makes it easier for us to replace something if something happens. But it works, it's really off‑the‑shelf products.
The only part that might not be off‑the‑shelf is the NTP and NTS servers on very top of the rack, and we will look into that later on.
On all these sites, the six time nodes placed in five different cities, we have secure bunkers, and we have time traceable to UTC, which is monitored as well and produced by SP and RISE. We have both free and commercial services for different SLAs and accuracy. And it's operated by us, monitored by RISE and financed by PTS.
And we're looking into more nodes and spreading them out in the country. I'm not sure that we will take that step, though. There are new technologies coming and we are all the time looking into new technology to be able to transport time over longer distances, because it's quite a heavy investment to establish those nodes.
Time is a service, it's something not very often spoken about but we have different services that we both sell and not sell actually, it's both for profit and not for profit. NTP service and the NTS service are examples of the services we supply without profit. So, if you are interested in any of those services, you can just go online and find it on our website and start using it.
If we look on the commercial side, we have Netnod PtP, which is our premium product on a dedicated fiber, highest accuracy, security, etc. But we also have Netnod Time Direct and Netnod Time Remote. Both those services are not the same high accuracy, but it's redundant with high SLAs and you really know the exact time for your industry.
If we look at Netnod Time Direct.
It's actually something that is very easy to start to use. If you are in any of our data centres or if you have an IX port, you can automatically upgrade your account and start using Time. This is a service that is very often used by fintech companies or banking, and also some of the energy companies in Sweden; they are usually fairly small companies spread out over the country. They use it a lot and it's a good primary source for them.
Netnod Time Remote is the first addition we made alongside PtP, and it's time over a VPN channel out to the countryside, and this is a way for us to move even further out in the country, away from our sites. And it's a CPE, and the CPE's clock is set by Netnod's clocks, so it's between those two points that we have a VPN connection. This is also something that the energy company and the national energy is looking into right now to use for their extended POPs.
Looking a bit deeper into NTS. I'm sure many of you know NTP, and NTP is a vulnerable protocol with no security. We can see a lot of attacks and it's often something that we hear about, with customers contacting us saying they have an issue. They have usually been hit by a very hard attack.
It's also difficult to scale the security. And we started to look into this quite early actually, but it took some time for us to take the step forward, and we looked into adding security to NTP in different ways, but we ended up looking into this algorithm that adds authentication and encryption to it. So, it's NTP with an S, and it scales, and it scales to numbers that we can't reach with any of the other products. And Netnod got involved as co‑authors in the IETF Working Group in 2018, and two years later they had an adopted RFC, but it still took as much as three years for us to get it implemented on all our sites. So it's not that easy to just get an ‑‑ get it accepted and start rolling it out. We are very proud of it and it's used and it's taken up by the industry as well.
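As an illustration of what "NTP with an S" adds before any time packets flow, here is a minimal Python sketch of the key establishment request (NTS‑KE) that a client sends over TLS. It is not Netnod's implementation and was not shown in the talk; the record layout and the numeric identifiers are taken from RFC 8915 as I understand it and should be treated as assumptions to check against the RFC. The helper names are hypothetical.

```python
import struct

# Identifiers as I read them in RFC 8915 (assumptions; verify before use).
CRITICAL_BIT = 0x8000
REC_END_OF_MESSAGE = 0
REC_NEXT_PROTOCOL = 1
REC_AEAD_ALGORITHM = 4
PROTOCOL_NTPV4 = 0
AEAD_AES_SIV_CMAC_256 = 15


def ntske_record(record_type, body=b"", critical=False):
    """Encode one record: critical bit + 15-bit type, 16-bit body length, body."""
    first_word = record_type | (CRITICAL_BIT if critical else 0)
    return struct.pack("!HH", first_word, len(body)) + body


def build_ntske_request():
    """Build a minimal client request: protocol, AEAD algorithm, end of message."""
    return b"".join([
        ntske_record(REC_NEXT_PROTOCOL, struct.pack("!H", PROTOCOL_NTPV4), critical=True),
        ntske_record(REC_AEAD_ALGORITHM, struct.pack("!H", AEAD_AES_SIV_CMAC_256)),
        ntske_record(REC_END_OF_MESSAGE, critical=True),
    ])


def parse_ntske_records(data):
    """Split a byte string into (critical, record_type, body) tuples."""
    records = []
    offset = 0
    while offset < len(data):
        first_word, length = struct.unpack_from("!HH", data, offset)
        body = data[offset + 4:offset + 4 + length]
        records.append((bool(first_word & CRITICAL_BIT), first_word & 0x7FFF, body))
        offset += 4 + length
    return records


if __name__ == "__main__":
    request = build_ntske_request()
    print(request.hex())
    print(parse_ntske_records(request))
```

In a real client these bytes would be written to a TLS connection to the key establishment server, and the response would carry the cookies and negotiated algorithm used to authenticate the subsequent NTP packets. The sketch only covers the encoding, so it can be run without any network access.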
We have it in all the six sites. And all the servers as I mentioned so you can decide from what server you would like to go and get it from. We can see most customers go to our Stockholm sites because those are the ones that you know most of but you can also go to this list here.
As I mentioned earlier on, the hardware NTP is the one that we built on our own, but it's not something that we try to hide or keep as a secret. Everything is put on GitHub, of course, and it's open source. It's actually quite different in a way that it's put and kept on this circuit. So, we know much more about our NTP than if you go to take the NTP from, for example, the NTP pool, because there you can get any source available from anybody. So if you take our NTP, you know it's actually more secure than if you go to the pool.
But it's open source FPGA code and we're very happy to share it with you. I think you are familiar with the cards and you can work on them. We don't think it's such a big cost, but that card is around €5,000 today.
The hardware NTP with NTS is something that we put into a commercial vendor box. It's a white box, we just wanted to try it out and it worked really well so we won't change it. There are many other variants possible, but for now this really works well for the need we have. There might be a case that we will upgrade it when the traffic increases even more.
To show you some examples on how to connect to it, I'm sure you can find it online on our website. You are fairly digital, all of you, but we can look at the references. It's more easy to get an overview of it.
We have the white papers of course and we have the 'how to'. And in the 'how to', there are a few examples on how to do it but also contacts if you need some support when you try to set it up. We also have a list of Netnod time services, but the one that I would like to push forward is actually Axis Communications. I'm not sure if you know about them, but they build video systems, hardware supply and software supply. They rolled out the NTS support around Christmas time and we could see traffic increase quite heavily at the time. It was actually quite astonishing to follow.
We have white papers: how it works and why it's important to you and your network, how we developed the first hardware for it, and also the implementation of network time security.
A few conclusions:
Well, developing this kind of national time distribution network, even though Sweden is a fairly small country, but stretched, demands definitely a lot of expertise and that's not easy to find today. We work hard on recruiting those people to our business but they are really hard to find. You need investments, and you also need cooperation.
The private public cooperation has been a huge success factor and it's critical to us to keep that going because we need long terms in this, it's nothing that you do in a year or two or three. You need like a ten‑year period to have an overview of it.
And also, I want to say that the customer doesn't always know what's right. Because we can see a lot of the time, when I'm talking about time and frequency, that the customers say they don't see any need for it, and then something happens and they come back to us and say, do you have a service for me? And we do, of course. And lots of countries and economies are talking about this today and how to find the motivation, how to roll it out. They need to understand that the need and the insight are not there yet, but they are coming. So, keep on talking about it.
We also need to be aware that distribution and the model today, technologies today, need to change, they need to keep developing, because we see IoT and sensors being a big thing right now. We can't really supply them with anything that is RFC level. We have ideas and we have some tests, but it's nothing that is accepted yet. But we need to continue the development of new protocols and algorithms all the time.
And I would like to conclude by showing you a small video‑clip, we'll see if it works. And this is to show you how the traffic actually started to roll out. We started the service in Stockholm in February 2022, and at the time we had 1,000 to 2,000 hits per hour, only enthusiasts in the beginning. Nothing happened for eight months and then they attended a hackathon in Berlin, talked and discussed about it, and the traffic took off. And during November we could see a rise from 10,000 to 100,000 hits per hour in just a few days. And then our person in charge said, over Christmas nothing is happening, I'm taking a few days off. He was wrong. So, mid‑January, we had a million hits per hour. End of January, 2 million hits per hour. So, this is spreading. It's not a huge amount of traffic but it's exciting to see how something can just take off in a few weeks. And you can see this film, we have a longer version as well if you are interested in it. And I am really happy to share it with you today.
WOLFGANG TREMMEL: Thank you, Karin. I have two online questions, well actually three, but two, and I am reading the first online questions, then we take the mic and then the last online questions.
The first online question is from Alex Band:
"Very nice presentation, speaking of security in software, are you aware of Project Pendulum, an open source effort to provide modern NTP and PtP implementations in Rust, a programming language that guarantees memory safety? It would be awesome if Netnod and other parties could support this work in some shape or form."
KARIN AHL: Yes, we are aware of it and I know we have been discussing it.
AUDIENCE SPEAKER: My name is Daniel Karrenberg, I am one of the co‑founders of RIPE, I am first and current employee of the RIPE NCC, and I'm speaking only for myself.
Thank you very much, Karin, we have a very long tradition of Swedes talking about time, Bitte Lethberg [phonetic] started it, talking about his clocks on rimus [phonetic] form. Here is my question:
Do you ‑‑ you always talk, of course, because of the funding and so on, about a national time service. And you did say between the lines that you don't mind that people outside Sweden use your NTS service. Can you be a bit more explicit about that, because people might have missed that.
And also, if I'm here in the Netherlands somewhere and I want to use NTS, what's the best strategy? Should I use yours? Are there others?
KARIN AHL: There are others, not many. I think there are two or three suppliers or producers of NTS today. I think there is a local in the Netherlands actually, so ‑‑ I can have a look at that, I'm not the expert. But yes, if you are not situated in Sweden you can also use our services, of course. We try to support other initiatives internationally to have a local supplier of time, but we will try to supply you with time if you need it, yeah.
AUDIENCE SPEAKER: Hi. Giovani. This is great work you guys have done. We are trying to do something similar to what you are doing, but you are much further ahead, you have much more money. So, I just want to point to something: if you wish to serve more clients within Sweden, like the IoT base, you should connect your servers to the NTP pool.
KARIN AHL: We are connected for many years.
AUDIENCE SPEAKER: The traffic you showed, where you get 2 million queries, what are the current numbers that you get? You should get way more than that if you are connected to the pool.
KARIN AHL: We are more than doubled from this and it's quite stable since a few months back so there is something happening, we think there is some sort of hardware implementation somewhere, we don't know who or where or why, so we are actually trying to find out to stimulate more take‑up of it. So, we're following it closely.
AUDIENCE SPEAKER: Thanks.
WOLFGANG TREMMEL: The next question is online again, Carsten Schiefner, no affiliation. He has two questions and I am asking the second one first:
"Will Netnod be feeding pool.ntp.org one way or the other? That would possibly be a great addition to your service."
KARIN AHL: It's the same as we heard earlier, so we are ‑‑
WOLFGANG TREMMEL: Since I didn't pay attention, I am asking the other question as well. Question 1 was:
"Given the kind of service but also your notion of governmental audits, is this considered a critical service according to NIS 2?"
KARIN AHL: Not yet, no.
AUDIENCE SPEAKER: Niall O'Reilly, and although I am Vice‑Chair of RIPE, I'm putting that hat aside, I am just ‑‑ your map makes a great impression on me, and I can see that there is demand from all over the world, but one part of the world stands out as a hot spot of demand and I'm curious about it and I wonder if you'd like to say anything about it.
KARIN AHL: The only thing I can say is we don't know why it's actually looking like that. We are trying to find ‑‑ we have a few different reasons but we don't really know.
WOLFGANG TREMMEL: Last online question from Henrik Kramselund, he wanted to know if you know how much one of your racks weighs? The weight?
KARIN AHL: No.
WOLFGANG TREMMEL: Okay, I guess so. I think there are no more questions. Thank you, Karin.
MOIN RAHMAN: In the presentation we have seen a presentation from Vesna about the environmental impact of Internet, let's see some practical numbers to check if we can really do it. Please welcome Peter Ehiwe from Stripe.
PETER EHIWE: Hi. My name is Peter Ehiwe, I'll be talking to you about techniques to reduce network power consumption.
So, a survey was conducted by the Uptime Institute, and over 500 IT and data centre operators were surveyed, and in terms of the sustainability metrics that they really care about, one is IT power consumption. They care a lot about PUE, water usage, carbon emissions, but one area they really care about is IT and data centre power consumption. I will focus on some strategies you can use on your network today to reduce the amount of power that it takes to run your network infrastructure. It's not an exhaustive list of things, but it's one that I feel would appeal to a wide variety of the people in the audience.
So, the focus of my talk will be on that highlighted sustainability metric, hopefully we can drive that down with some of these techniques.
If you look at the research around reducing network power consumption, there are two main research areas. First you have sleep mode, which is about turning off network components to save power. And then you have rate adaptation, which borrows some ideas from microprocessors, dynamic voltage and frequency scaling, and tries to apply that to networking. But that will not be my main focus because it still does not have wide commercial adoption, and a lot of the things I'll be talking about will fall into that sleep mode bucket. As I said, it's a non‑exhaustive list, but hopefully people can relate to some of these things.
The first is the IEEE 802.3az standard, which effectively replaces continuous idle on Ethernet with low power idle, and that allows you to save power by turning off parts of your transmit circuitry when there is little or no traffic. It works for Ethernet over copper, so 100 meg, 1 gig and 10 gig BASE‑T. And in terms of its effectiveness, I just took an image from a research paper, and there is an interesting study around the effectiveness of 802.3az as the load on the link increases. What you find is that at a 24% load you reach diminishing returns, and effectively the link behaves like an always‑on Ethernet link. However, at 0 percent load, the amount of energy consumed to keep that link up is about 10%, so you have about 90% potential savings there. And as you increase the load, the energy consumed increases, so the amount of energy you save reduces. But even at 8% load you still have 50% power savings potential.
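To make the three figures quoted above easier to play with, here is a small Python sketch that turns them into a rough load‑to‑savings curve. Only the three data points (0%, 8% and 24% load) come from the talk; the straight‑line interpolation between them is an assumption made for illustration and is not the curve from the research paper. The function names are hypothetical.

```python
# (link load, energy used relative to an always-on link), as quoted in the talk.
QUOTED_POINTS = [(0.00, 0.10), (0.08, 0.50), (0.24, 1.00)]


def eee_energy_fraction(load):
    """Estimate relative energy use at a given load (0.0 to 1.0).

    Linear interpolation between the quoted points is an assumption.
    """
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0.0 and 1.0")
    if load >= QUOTED_POINTS[-1][0]:
        # Beyond roughly 24% load the link behaves like always-on Ethernet.
        return 1.0
    for (x0, y0), (x1, y1) in zip(QUOTED_POINTS, QUOTED_POINTS[1:]):
        if x0 <= load <= x1:
            return y0 + (y1 - y0) * (load - x0) / (x1 - x0)
    return 1.0


def eee_savings_fraction(load):
    """Potential saving compared with an always-on link."""
    return 1.0 - eee_energy_fraction(load)


if __name__ == "__main__":
    for load in (0.0, 0.04, 0.08, 0.16, 0.24, 0.50):
        print(f"load {load:.0%}: estimated saving {eee_savings_fraction(load):.0%}")
```

The useful takeaway is the shape rather than the exact values: savings fall off quickly with load, so links that sit mostly idle benefit the most.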
There are techniques to make this more efficient by using buffering, basically to keep the link idle for much longer, but you need to think about the application and the impact the increased latency would have on your specific environment.
The other enhancement, which I think is cool, is the ability to determine the length of a cable and then adjust TX power accordingly to send the data from point A to point B.
So the second thing I want to talk about in terms of sleep mode techniques is wireless access points. I think the key question is, if you run an enterprise network with a lot of access points, you need to ask yourself: do you need to run all the APs at all times? At off‑peak hours, do we need to run those APs when everybody is out of the office, or at weekends? I think that even in this mode of hybrid working, these are questions that we need to ask ourselves, and the answers will vary from business to business.
And also, can we develop an algorithm that dynamically powers up APs to meet the load? There was interesting research on that last question conducted by a group of researchers in conjunction with Intel, looking at enterprise networking at Intel and at Dartmouth, and they developed this green clustering algorithm that effectively groups APs in clusters based on spatial distance and elects a cluster lead AP that serves that cluster, and you have surrogate APs that are powered up as the load increases. There are a lot of datasets in the research paper I linked here, but the interesting thing is, they were able to save 20 to 50% at Dartmouth College in terms of the power required to operate the wireless network, and at Intel they were able to save 50 to 80%, and they also had an additional 10% power saving in one‑hour intervals, so up to 90%. And the reason for that additional power saving was specific nuances of the network architecture, where the access points were closely tied to the data switches they connected to, so they were able to power off the switches as well, in addition to powering off some of the access points.
And then the disparity between the savings at Dartmouth College and on the Intel wireless LAN is basically due to business requirements: Dartmouth required basic connectivity. At Intel, it's all about high bandwidth, V N, so on and so forth, so they deploy a much more densely populated wifi network.
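The clustering idea described above (group nearby APs, keep one lead AP on, wake surrogates as load grows) can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the researchers' algorithm: the greedy grouping rule, the radius, the per‑AP client capacity and all names are hypothetical choices made for the example.

```python
import math


def cluster_aps(positions, radius):
    """Greedily group APs: each unassigned AP becomes a lead and absorbs
    every other unassigned AP within `radius` as a surrogate."""
    clusters = []
    unassigned = sorted(positions)
    while unassigned:
        lead = unassigned.pop(0)
        surrogates = [
            ap for ap in unassigned
            if math.dist(positions[lead], positions[ap]) <= radius
        ]
        unassigned = [ap for ap in unassigned if ap not in surrogates]
        clusters.append({"lead": lead, "surrogates": surrogates})
    return clusters


def active_aps(cluster, clients, capacity_per_ap):
    """Return the APs that should be powered on for the current client count.

    The lead AP always stays on; surrogates are added as load increases.
    """
    needed = max(1, math.ceil(clients / capacity_per_ap))
    members = [cluster["lead"]] + cluster["surrogates"]
    return members[:needed]


if __name__ == "__main__":
    # Hypothetical floor plan, coordinates in metres.
    positions = {"ap1": (0, 0), "ap2": (5, 0), "ap3": (8, 0), "ap4": (100, 0)}
    clusters = cluster_aps(positions, radius=10)
    for clients in (0, 25, 45):
        print(clients, active_aps(clusters[0], clients, capacity_per_ap=20))
```

With an empty office only the lead AP of each cluster stays powered; the savings reported in the study depend on how dense the deployment is, which is the point made about the difference between the two sites.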
If you are a hyperscaler, you may be interested in port LEDs. I think another question we need to ask ourselves if we operate really, really large networks, I'm talking about 50K, 100K switches: do we need all the link LEDs flashing all the time with no one looking at them? Most of the time you look at LEDs when you try to fix a problem with the link. So, what if we could turn off LEDs by default and turn them on when maintenance needs to be done? I did some simple maths here to calculate the savings we could make with 100K switches, and there are very few of them, but even telcos have loads of networking gear and footprint as well. With an average of 20 cables per switch, and if you ‑‑ you would use about 10 to 15 kilowatts of power per hour for a year, just to power those LEDs, and if you think about the cost savings, if you use the world average price, it's 0.165, you are saving 86,000 a year. I think the exciting one is the ability to divert that energy and use it to power 830 Irish homes for a year.
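The "simple maths" above follows a pattern that is easy to reuse with your own numbers. The Python sketch below shows the form of the calculation only. The switch count, LEDs per switch and price per kWh are the figures mentioned in the talk; the per‑LED power draw is not stated in the transcript, so the 5 milliwatts used here is a placeholder assumption, chosen so that the total lands at the lower end of the 10 to 15 kilowatt range mentioned. The result is not meant to reproduce the cost figure quoted in the talk.

```python
HOURS_PER_YEAR = 24 * 365


def led_power_kw(switches, leds_per_switch, watts_per_led):
    """Continuous power drawn by all link LEDs, in kilowatts."""
    return switches * leds_per_switch * watts_per_led / 1000.0


def annual_energy_kwh(power_kw, hours=HOURS_PER_YEAR):
    """Energy used over a year at constant power."""
    return power_kw * hours


def annual_cost(energy_kwh, price_per_kwh):
    """Cost of that energy at a flat price per kWh."""
    return energy_kwh * price_per_kwh


if __name__ == "__main__":
    # 100K switches and 20 LEDs per switch are from the talk;
    # 0.005 W per LED is a hypothetical placeholder value.
    power = led_power_kw(switches=100_000, leds_per_switch=20, watts_per_led=0.005)
    energy = annual_energy_kwh(power)
    print(f"power: {power:.1f} kW")
    print(f"energy: {energy:,.0f} kWh per year")
    print(f"cost at 0.165 per kWh: {annual_cost(energy, 0.165):,.0f} per year")
```

Swapping in a measured per‑LED figure for your own hardware is the step that matters; everything else is multiplication.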
So finally I'll talk about rate adaptation. This is more kind of theoretical at this point, and basically, as I said, it takes DVFS and tries to apply it to networking. There are not many commercial implementations so I won't spend too much time on it, but if you are interested in reading more about this ‑‑ I see I have 12 seconds left ‑‑ you can check out the research paper, and if you have any questions, let me know or we could talk more later, and I would like to know what green initiatives you are actually working on at your end as well. Thank you.
WOLFGANG TREMMEL: Thank you. I think we have time for one or two questions if there was anything.
Okay. Thank you very much.
The next one is Olena Kushnir, and she is going to talk about regional success stories in the conditions of war.
OLENA KUSHNIR: I will be a little bit faster because I am limited in time.
Good day, ladies and gentlemen. I will share some regional success stories in the conditions of war. For those who don't know me, my name is Olena Kushnir, I am the co‑founder and director of the Ukrainian Communications Operator and Integrator IT Solutions. Additionally, I am an active member of several specialised associations including the Internet Association of Ukraine and the Ukrainian Union of industrials. As a Ukrainian LIR, my company is also a member of the RIPE NCC.
Ten years ago, I sat like you at a conference, and could I even imagine that Ukraine would be bombed or that we would have a full‑scale war? But life gives us new challenges that make us stronger, and every challenge gives us new experience. Today, Ukrainian specialists overcome huge obstacles and bring the telecommunications networks to a new level of stable operation despite all the destructive conditions, and here are some of the challenges we face every day:
For example, blackouts, constant missile attacks, destroyed infrastructure, constant cyber attacks, dangerous working conditions of our employees, mobilization of employees and theft of Ukrainian resources.
Let's start with the blackouts. 80% of the energy infrastructure was destroyed by massive attacks. The operators made a number of correct management decisions. Home providers have actively switched to one technology, but business providers have it more difficult because it was necessary to equip technical sites with serious uninterrupted power supply systems and generators. Also, providers used Starlink, especially in cities freed from the occupation.
In the photo you can see that immediately after the liberation, people went to the central square of the city to charge their mobile phones and to get the opportunity to call their relatives and tell them, "I am alive". This is an incredibly touching photo.
The know‑how became shelters of unbreakability: heated shelters where you can find free tea and a comfortable place for work, and of course unlimited high speed Internet. Our company, for example, has connected free Internet in such shelters in three cities in Ukraine. You can't even imagine what could be seen there. Many offices, students passing exams. While mum is working online, her child is watching cartoons next to her, and even beauty salons, because somebody did not have time to finish her manicure. It was crazy, but it shows that Ukrainians are really unbreakable.
This is the director and owner of the ELIT line provider, only 50 kilometres from Bakhmut, where heavy fighting continues. Micola became an example of the unbreakability of the whole Ukrainian sphere. Due to blackouts and due to the lack of employees, he personally spent more than one night at the technical site to control the work of the generator.
Ukraine is still subjected to a large number of missile attacks and drone strikes every day, and the telecommunications infrastructure in various regions has suffered serious damage. The enemy is active in attacking civilian and critical infrastructure. We lack equipment and engineers.
On the 9th of March this year, when Russia launched 81 missiles at different regions of Ukraine, some missiles hit an infrastructure object, which led to a momentary blackout. You can see on this drawing how the traffic flowed. The huge Kharkiv region remained almost without communication. I can say from the experience of my company that we had to change five different routes from Kiev to Lviv within one day.
Next is the dangerous working conditions. Via the QR code on the slide you can see a link to the Google folder where you will find many photos from the life of the telecommunications industry of Ukraine during the war. There is a sad story behind this photo, from one of the biggest Ukrainian telecom operators: the driver died and others were seriously injured.
And this is a trunk operator at Racom [phonetic]. The photo was taken in Kherson region. The employees were injured and these pieces of metal were removed from the bodies of the engineers.
We risk our lives every day so that the Ukrainian segment of the Internet will not be fragmented from the world network and our people stay connected.
And next is cyber attacks. I received these official statistics from the Computer Emergency Response Team of Ukraine at the State Service of Special Communications and Information Protection of Ukraine. Almost three times more cyber incidents were registered in 2022 than in 2021. More than a quarter of all cyber incidents and cyber attacks were directed against the government, local authorities and other sectors of critical infrastructure.
Russian cyber attacks in Ukraine are often synchronised with ground and air strikes on civilian and critical infrastructure, and in the world, Russian cyber attacks have become part of their information and propaganda machine. Since the beginning of the war, attackers supported by the Russian government have aggressively attacked not only Ukrainian but neighbour member states too.
I saw two excellent articles that highlighted the real situation in Ukraine with IP resources. As of February 2022, 2,197 AS numbers were assigned to networks in Ukraine. In the last twelve months, more than 100 network registrations have moved out of Ukraine. More than 40 of these are now registered in Russia. This is the biggest line.
Some networks disappeared from the global routing table. For example, on March 20, 2022, AS number 6712 from RTV, based in Mariupol, disappeared from the global routing table. It returned after months, but with only half of the address space that it was using.
Other large networks from the same region such as AS number 57864 megabit and AS number 35714 also disappeared around the same time but did not return.
Here is another great article. Internet services in the Russian‑controlled part of eastern Ukraine have been provided through Russian transit providers for many years. But last year Internet resources in Europe changed in front of the world as if it were actually Russian.
And this region has been subjected to a type of Russification, an assimilation where the Ukrainian residents of the region have been forced to accept all things Russian: language, currency, telephone numbers and of course Internet.
And of course another problem is traffic switching. It is one of the biggest problems in occupied territories: they turn off mobile communication from Ukrainian operators and also switch fixed communication channels to the Crimean operator Miranda. Everybody knows that Miranda is a daughter of Rostelecom.
And the Ukrainian telecommunications sphere is really unbreakable because we also have our guardian angels. It is a voluntary from here. If you have time to join the guardian angels of the Ukrainian telecoms sphere, I urge you to do it.
Now the Ukrainian delegation is here because we are part of the international professional community and we have a unique experience that we are ready to share with you. Ukraine is strong, brave, unbreakable and we are ready to open it for you.
Thank you so much.
WOLFGANG TREMMEL: Thank you. We have time for questions, one or two. Okay.
KARIN AHL: Please keep Ukraine connected. Thank you.
WOLFGANG TREMMEL: Thank you very much. So the next one should be a remote presentation, so I'm setting on speaker. And the next one will be Florian Streibelt, are you with us?
FLORIAN STREIBELT: Yes.
WOLFGANG TREMMEL: Florian will be talking about DNS, IPv6 and how broken is everything.
FLORIAN STREIBELT: Hi. I am addicted to DNS. So, in today's lightning talk, I'm talking about two very light topics: IPv6 and DNS, of course. So, let's see how bad it can be.
Actually, when your client and recursive resolver are only able to use IPv6 and there is no IPv4 involved anywhere, that's what I call IPv6‑only in this talk, the experience can be quite underwhelming. So what you see here is what you get when you point your browser at Wikipedia. However, when you are checking with some IPv6 testing page, you get a green check mark. And when you do a DNS look‑up from another machine you get IPv6 addresses, you can ping them and they would even serve that site. So, where does it actually break? What's wrong there? It seems to be working, right; everything you do to debug is actually working.
So, of course, it's DNS and not IPv6, so again we prove it's DNS and never the network. But on a more serious note, the issue here of course is that the name servers have no IPv6 addresses. And the interesting fact here is that the underlying issue was already described as early as 2004. So, what better thing to do as an academic than write software and crunch some numbers?
First, let's talk about DNS and refresh our DNS knowledge a little bit.
So, here you see the well known DNS tree that we all know from our text books, with the root zone at the top. And what you have to know is that for a domain or a zone to work, it has to be known and configured in a parent zone. For the top level domain this would be the root zone; there it has to be configured so we can find it when we are looking for it. And to resolve a host name, for example mail.example.com, one has to start at the root and follow along the edges of the tree, walking from node to node. This works because the name servers of a zone will reply with the DNS records of its child zone when the name requested is in or below the respective child zone. So, each zone will hold all information about itself but only a minimum about its child zones. This allows scaling and make each zone a similar sort of tools but also requires some cooperation. However, there are some caveats.
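The walk from the root down the tree can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `ZONES` table, the `resolve` helper and the address from the 2001:db8::/32 documentation prefix are all hypothetical and stand in for real name servers and real queries.

```python
# Toy model: every zone knows its own records and only the names of the
# child zones it delegates. All data here is made up for illustration.
ZONES = {
    ".": {"children": {"com."}, "records": {}},
    "com.": {"children": {"example.com."}, "records": {}},
    "example.com.": {
        "children": set(),
        "records": {"mail.example.com.": "2001:db8::25"},
    },
}


def is_at_or_below(name, zone):
    """True if `name` is the zone apex or lies somewhere below it."""
    return name == zone or name.endswith("." + zone)


def resolve(name):
    """Start at the root and follow referrals from zone to zone.

    Returns the list of zones visited and the record found (or None).
    """
    zone = "."
    path = [zone]
    while True:
        data = ZONES[zone]
        # A zone answers with a referral when the name is in or below a child zone.
        child = next(
            (c for c in data["children"] if is_at_or_below(name, c)), None
        )
        if child is None:
            # No further delegation: this zone is responsible for the name.
            return path, data["records"].get(name)
        zone = child
        path.append(zone)


path, address = resolve("mail.example.com.")
print(path, address)
```

Each step only needs the parent to know which child zone to refer to, which is the minimum of information about child zones mentioned above.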
First of all, it's clear that each zone depends on the parent zone. It's a tree, after all, and that's where delegation has to happen.
Note that this data has to match the information in the child zones, so we have to make sure that parent and child actually have the same information about the name servers. Additionally, you'll see that when using external name servers, you add additional dependencies: here on the right, example.net will depend on the availability of example.com, and example.com will depend on dotcom. So example.net actually will depend on the dotcom TLD.
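As a rough illustration of how such dependencies accumulate, the following sketch collects every zone that a given zone relies on, either as a parent or because one of its name server names lives there. The `NAME_SERVERS` table and the helper names are assumptions made for this example, not data from the measurement.

```python
# Hypothetical delegation data: zone -> names of its name servers.
# TLD and root server names are left out to keep the example small.
NAME_SERVERS = {
    ".": [],
    "com.": [],
    "net.": [],
    "example.com.": ["ns1.example.com."],
    "example.net.": ["ns1.example.com."],  # external name server
}


def parent(zone):
    """Return the parent zone name, or None for the root."""
    if zone == ".":
        return None
    rest = zone.split(".", 1)[1]
    return rest or "."


def enclosing_zone(name):
    """Longest known zone that contains `name`."""
    candidates = [
        z for z in NAME_SERVERS
        if z == "." or name == z or name.endswith("." + z)
    ]
    return max(candidates, key=len)


def dependencies(zone):
    """All zones that must work for `zone` to be resolvable."""
    seen = set()
    todo = [zone]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        up = parent(current)
        if up is not None:
            todo.append(up)  # every zone depends on its parent
        for ns in NAME_SERVERS.get(current, []):
            todo.append(enclosing_zone(ns))  # and on the zones of its name servers
    seen.discard(zone)
    return seen


print(sorted(dependencies("example.net.")))
```

In this toy data, example.net ends up depending on the dotcom TLD purely through its external name server, as described above.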
Another common configuration is to use name servers whose names are in bailiwick of their own zone. This means that, as you can see for example with ns1 or ns2.example.com, their fully qualified domain name ends in the name of the zone itself. It leads to the problem that to obtain the addresses of these name servers one would have to connect to those servers. We have a circular dependency, and this is resolved by putting the IP addresses into the parent zone using the usual AAAA records. This is a very common configuration and we will see later how this thing looks.
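A minimal sketch of the in-bailiwick test, assuming the simple definition used in the talk (the name server's name ends in the name of the zone itself). The function names are hypothetical; the check only tells us which name servers would need their addresses placed in the parent zone.

```python
def normalize(name):
    """Lower-case a domain name and make sure it ends with the root dot."""
    name = name.lower()
    return name if name.endswith(".") else name + "."


def in_bailiwick(ns_name, zone):
    """True if the name server's name is at or below the zone apex."""
    ns_name, zone = normalize(ns_name), normalize(zone)
    return zone == "." or ns_name == zone or ns_name.endswith("." + zone)


def glue_needed(zone, ns_names):
    """Name servers whose addresses must be provided by the parent zone."""
    return [ns for ns in ns_names if in_bailiwick(ns, zone)]


servers = ["ns1.example.com", "ns2.example.com", "ns1.some-dns.tld"]
print(glue_needed("example.com", servers))
```

The comparison is done on whole labels, so a name such as ns1.notexample.com is not treated as being inside example.com.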
After all, we identified five root causes for failing IPv6 resolution. They include completely missing AAAA records for the name servers, but also mismatches between parent and child. But please see the paper for more details on these five issues.
All of these misconfigurations are not mutually exclusive and they are not specific to IPv6 but as long as a resolution by IPv4 is working, these issues will have no practical impact. So, this makes them very hard to spot as the full resolution chain has to be verified using only IPv6 if you want to check for these issues.
And note that one single misconfigured zone will of course break all the child zones below it in the tree.
Let me give one quick example of how indirect dependencies can make this even harder to spot. Here, we look at a perfectly configured example.com zone that has two external name servers in some dns.tld. The delegation from the parent zone dotcom is set up accurately as well. And also the name servers have AAAA records configured in their zone. However, the zone of the name servers from the DNS TLD uses yet another zone and this zone cannot use IPv6. So this breaks IPv6 resolution as the IPv6 addresses of the name servers cannot be retrieved from this indirect dependency.
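The indirect dependency in this example can be modelled with a small recursive check. This is a sketch under assumptions, not the tooling used for the study: the zone names, the `NameServer` record and the rule that the root is always reachable via hints are all choices made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameServer:
    host_zone: str      # zone that holds the name server's address records
    has_aaaa: bool      # an AAAA record exists for the name server
    glue: bool = False  # the parent zone carries that AAAA record as glue


# Hypothetical data: zone -> (parent zone, name servers).
ZONES = {
    "com.": (".", [NameServer("com.", True, glue=True)]),
    "tld.": (".", [NameServer("tld.", True, glue=True)]),
    "example.": (".", [NameServer("example.", True, glue=True)]),
    # This zone is served over IPv4 only.
    "other.example.": ("example.", [NameServer("other.example.", False, glue=True)]),
    # The name servers of dns.tld have AAAA records, but their names live in other.example.
    "dns.tld.": ("tld.", [NameServer("other.example.", True)]),
    # example.com itself is configured correctly, with two external name servers.
    "example.com.": ("com.", [NameServer("dns.tld.", True), NameServer("dns.tld.", True)]),
}


def resolvable_v6(zone, visiting=frozenset()):
    """True if `zone` can be reached from the root using IPv6 only."""
    if zone == ".":
        return True  # assumption: root addresses come from the resolver's hints
    if zone in visiting:
        return False  # dependency loop that no glue record breaks
    parent, servers = ZONES[zone]
    if not resolvable_v6(parent, visiting | {zone}):
        return False
    for ns in servers:
        if not ns.has_aaaa:
            continue
        # The address is usable if it arrives as glue or can itself be looked up.
        if ns.glue or resolvable_v6(ns.host_zone, visiting | {zone}):
            return True
    return False


for name in ("com.", "dns.tld.", "example.com."):
    print(name, resolvable_v6(name))
```

Looking only at example.com and its direct parent would report nothing wrong here; the failure shows up only when the zones hosting the name server names are followed as well.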
Now, to check how the situation looks in the Internet we looked at passive DNS data, and I'll just gloss over this very quickly here, it's all in the paper. Last October I sent 56 million requests to the Internet, so please check the paper for more details on that, and there's also a link to a recording of a longer version of this talk at the end of this presentation.
So, what did we see besides a lot of broken name server implementations and a very convoluted DNS topology? Let me highlight the key findings.
We start from the zones that show some form of IPv6 intent, that means we see some AAAA records. So we think they might be interested in having working IPv6. So, of these, last August, almost 45 percent could not be resolved using only IPv6. So, they're not working if you have only IPv6 connectivity from the root to the zone. We are now zooming in on these and looking at the reasons why they are not working. This is what we find here in the plot. For the majority of zones, the parent zones are already broken, so even if these zones were configured perfectly correctly and had AAAA records, you would never be able to resolve them because the parent or one of the parents isn't working with IPv6‑only.
So, why is this form of misconfiguration often overlooked? For example, had no Google records for quite some time over IPv6, we don't know if this was on purpose or not. So, the problem here is that at the moment there is no real operational impact if IPv6 resolution is not working. All of the resolvers usually have IPv4, and DNS is specifically designed to be robust. As long as at least one authoritative name server of a zone can be reached, the resolution will actually work. It also means that monitoring solutions have to take this into account, not just for DNS by the way. So as we have seen in this example here, it's obviously not sufficient to monitor DNS as ability, but it seems that many solutions out there currently in use just look more sophisticated than this script, but in reality they are not really.
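To make the monitoring point concrete, here is a minimal sketch that evaluates a zone separately per address family instead of asking only whether resolution works at all. The input format (a mapping from name server to the families over which it answered) and the function name are assumptions for this example; gathering those observations is left out.

```python
def family_report(name_servers):
    """Summarise reachability per address family.

    `name_servers` maps each name server to the set of address families
    ("v4", "v6") over which it answered a test query.
    """
    answers = list(name_servers.values())
    return {
        # What a dual-stack check sees: any server answering over any family.
        "dual_stack": any(families for families in answers),
        "v4_only": any("v4" in families for families in answers),
        "v6_only": any("v6" in families for families in answers),
    }


# Hypothetical observation: both name servers answer, but only over IPv4.
observed = {"ns1.example.com.": {"v4"}, "ns2.example.com.": {"v4"}}
report = family_report(observed)
print(report)
```

A check that only looks at the dual-stack result stays green for this zone, while the per-family view shows that an IPv6‑only resolver would fail.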
So, let me already summarise.
What we did was we conducted a long‑term passive measurement study over a dataset of several years, and verified it with active measurements as well. We did the root cause analysis for broken IPv6 delegation in an IPv6‑only setting, so only IPv6 for everything. And we confirmed all findings with the active measurements.
In August '22, 45 percent of the zones that we considered for IPv6 were not resolvable using only IPv6. Most commonly, the zone names of all the parent zone name servers were not able to resolve.
And I was pretty fast with that. I am at the end of this talk. Happy to take questions, also please don't hesitate to send me a mail for questions, and there are links to the paper and to a longer recording. Thank you.
WOLFGANG TREMMEL: Thank you, Florian. Are there any questions?
Okay. And with that, a reminder, please rate the talks. Go to the website, ripe86.ripe.net, Plenary agenda, click the rate button and give your rating. We are looking at that.
Also, if you want to join the PC, Programme Committee, please put your name forward. Florian, thank you very much for the remote presentation, and I hope that we see you in person perhaps next time again.
FLORIAN STREIBELT: Of course, happy to be back.
WOLFGANG TREMMEL: With that reminder, there is another BoF today, so there is a BoF about best current operational practices, and after that there are drinks, and I hope to see you all there, and with that, thank you.