Episode 1: Introduction to Network Science or: How to find the submarines
Updated: Sep 14, 2022
What's the rumpus :) I'm Asaf Shapira and this is NETfrix. In this podcast we will talk about the magical field of network science, graph theory, social network analysis or SNA and everything in between. And that's a lot… I have been practicing network analysis for years and also founded the first podcast in Israel to tackle the fascinating world of network science.
Maybe it's my Israeli upbringing, but I find network analysis much like Krav Maga.
Krav Maga is a fighting style that was developed in the Israeli military and uses the best practices and techniques from many different martial arts.
So is network science to the many other fields that are associated with it such as statistics, mathematics, physics and social sciences, making it an essential field to understand our world and our data.
And like Krav Maga, if you know how to analyze your network – no one will mess with you.
I've initially planned to start by saying that we live in a networked world, but it's such a cliché talking about how inter-connected the world is. On the other hand, you know what they say about clichés: It's an unsexy state of the truth. And if it's true that the world is a network, well then, not understanding networks is borderline negligence, so good for you! Let's begin.
Data science is a hot topic these days and many of its followers fall into two categories: Those who deal with networks and those who do not yet understand that they deal with networks. From cyber defense or attacks, smart city projects and urban planning, through product recommendation, language analysis, gaming networks and more. We will explain the advantages of looking at our world and our data through the network perspective and how constructing our data as a network allows insights into many of the issues that concern us.
The same is true in the field of machine learning. Treating data as a network rather than a collection of features, allows for the improvement of classification and the extraction of many other important features.
But what proves to be a higher challenge for the data scientist or analyst is explainability, that is, the ability to explain the results of the analysis. Analysts are not just data analysts. They are also the storytellers of the 21st century.
In the distant past, if the old-version analyst had seen an erupting volcano, then he could have explained it by arguing that the fire god was angry. Then the chief's analyst (pun intended) would prioritize society's resources in favor of sacrifices to calm the upset god.
Today we're expected to rely more heavily on data, but the need for a story remains the same. That is, why do things happen?
The beauty behind looking at the world through network glasses is that it allows us to see unfamiliar dimensions in the data on the one hand, and on the other, it makes it easier for us to create a narrative that is easy to convey, even to those who pay us our salary.
Another interesting field that uses network analysis is neuroscience: the brain is one of the most fascinating and mysterious networks. In light of increasing resolutions and improvements in brain mapping, the use of Social Network Analysis (or SNA) in the field has been gaining momentum in recent years. Advanced sensors, such as fMRI and more, allow us to understand the brain as a network and learn through it about this complex system.
Biology is another field that exploits network science, whether to understand ecosystems and interactions between elements or to study the spread of diseases. Epidemiology has become quite relevant these days, as Covid-19 and contact-tracing networks go hand in unwashed hand. And when we are lacking in this field, we can sadly feel it. Originally, this episode was created in early 2020 when Covid-19 was as exotic as bat soup. Up-to-date episodes of this podcast will shed more light on this subject, from a network's point of view.
There is also a lot of network research in the field of physics - but no one understands physics so we will continue. The use of network analysis is not limited to exact sciences only.
In fact, the field of network research owes much to the fields of sociology & anthropology that were among the first to make use of network tools. Networks are also applied in history research, for example, Dr. Harel Chorev, from Tel Aviv University, showed how, with the help of SNA, it is possible to understand political and social success in a fascinating study published in 2019.
And if network analysis is relevant to politics and society, then it's most certainly relevant to organizations. ONA, or Organizational Network Analysis, is used by managers to better understand their organization. Through network glasses they can examine its function and how it really works, not just as it reads out in the brochure, and of course, use it to locate its weaknesses and strengths.
Weaknesses and strengths of systems can be translated to vulnerabilities and gravity centers, which are what cyber and intelligence organizations are looking for in the systems they surveil. In my humble opinion, there is no better tool for understanding, or even better, dismantling, a rival system than network analysis. SNA allows you to do it quickly, powerfully and elegantly. From a quick understanding of the enemy's structure and intents to its centers of gravity, i.e. the desired targets.
In the U.S. military, for example, after years of using SNA in campaigns in Iraq and Afghanistan, network analysis even became part of the military formal doctrine in 2016. SNA is also used in the world of advertising, for example by using influencers to push products no one needs to innocent customers.
And last, but not least, computer science.
I have met many times with Computer Science students that approached me after my lecture to say: Listen, I've learned about the theories you talked about, but until the lecture I didn’t have a clue how to apply them.
So, for all those practicing those fields and many more, this podcast is for you.
We will learn about how networks are built and what rules govern them. How to locate key players in the network, what applications to use, analysis tips and more. Through real world examples (and maybe some fictional ones) we will gain insights that analysts, researchers and data scientists often reach only after a long hands-on experience.
But apart from its many applications, there is something in network science that makes people fall in love with it:
For example, when I was at a meetup about graph theory recently, the data scientist who gave the opening speech spoke of how much she enjoyed researching graphs in academia, but only when she got out to the real world did she discover, to her surprise, that it also has practical uses. On a side note, the speakers at the meetup showed interesting examples of insights that can be gained from constructing data as a network, whether in consumer product classification or in shipping research.
What can I say, analyzing networks is also fun.
But why? That's because network science is not just another tool we can use. It's an eye-opening way of looking at reality. Seeing reality through network eyes allows us to understand its complexities, to understand the logic behind it and to find the centers of gravity and what drives it. And if you don't believe me, I suggest you go back and read the entire "Dirk Gently" series by Douglas Adams. Or for those of you with TLDR syndrome I'll sum it up: "Guys, everything is connected".
The reason we have a hard time understanding it is that often we, and the organizations we work in, rely too heavily on basic intuitions, which tend to be linear by nature.
In linear thinking, we want to advance toward our goal in a straightforward, fast manner. Why? Because it is the simplest and most intuitive way. But this method makes it difficult for us to understand complexities and makes us look for quick solutions to the wrong problems.
To use the worn-out Forest metaphor, linear thinking tackles the forest one tree at a time, while aspiring to do it as fast as possible. A network approach, on the other hand, shows us the entire forest and points to the important trees. In fact, it can show us if we're in the right forest to begin with. This usually doesn’t even bother linear thinkers. These guys will cut down anything.
The challenge of assimilating network thinking in organizations can be illustrated by the following historical story.
Spoiler alert: I guess anyone that has worked in a big organization would find this story tragically familiar.
In World War II, the most significant challenge facing Britain was undoubtedly the threat of German submarines. Britain is an island and the sea is its lifeline. What can be done when the submarines emerge out of nowhere, attack and retreat, and every ship that goes out to sea is under threat?
In order to find a solution, the admiral of the British fleet decided to hold a workshop on the subject titled: "Out of the Box". As part of the open-mindedness spirit of the workshop, the participants tried to drill down to the basic components of the problems they face. In this context, the intelligence officer who gave the initial brief, explained the necessary conditions for operating submarines. A key factor he mentioned was that they are required to operate in an aquatic environment. Suddenly, a light bulb lit over the head of one of the participants. He was a young and energetic lieutenant, who was sent to the workshop against his will, in order to represent "boots-on-the-ground" way of thinking. "Well," thought the Lt. to himself, "Here's something we can do instead of blabbering on in workshops. No water - no submarines."
And he followed through.
Under the catchy slogan "No water-No submarines," he lined up his men on the beach, and equipped with shovels, they began to take out seawater. He informed his commanders of the activity, and they invited the senior command of the navy to come and check out for themselves the down-to-earth initiative that originated from the men on the field. The senior command was very impressed with the initiative and charisma of the young company commander and appointed him battalion commander. At the same time, they called the British academy to come down from its ivory tower and help Britain battle the sea, literally.
One of the scientists who responded to the call, told himself that he must go down to the field to fully understand the problem. He arrived at the company base and was amazed to see a hundred people standing on the shore, heaving water from the ocean.
-"What are you doing?" He asked one of the soldiers in amazement
-"Taking out water", the soldier replied briefly
-"Yes, but why?" -"You'll have to talk to the officer about this."
The scientist approached the officer and asked: "Why are you taking out water?"
The officer replied: "What do you mean why? To empty the ocean, that's why."
-"But why empty the ocean? " -"Ah.. I don’t know. You'll have to talk to the previous CO about this."
The scientist went to the office of the former lieutenant, now a battalion commander, and questioned him about the strange activity he had seen.
"Oh, of course." Said the new battalion commander, "It's to find those damn submarines. No water - no submarines. They will stand there like ducks in a range I tell you."
"Submarines?!" The scientist was bewildered "Your problem is finding the submarines?! So why didn’t you say so?"
The scientist rushed to his lab and came back after two weeks with a pair of antennas, a computer, lots of cables and motivation and started setting everything up on the beach alongside the vigorous heavers.
What is this? Asked the damp soldiers as they approached the scientist.
This is a radar, the scientist said proudly. See?
"Click here to turn on the computer, here you need to aim the antennas to collect the data. The antennas receive and transmit the position of the submarines. Then sort the results here and there. Present the map interface by pressing F11 like so and, voila! See all these blips in the center? That's the submarines!"
Cool! Said the soldiers to each other as they shuffled back in the mud, mumbling in praise of the scientist who had proudly returned home to his wife, to tell her how he had single-handedly saved Britain.
After two weeks when he returned to the shore to see what's what, he saw that another row of heaving soldiers had been added to the former company soldiers that kept heaving seawater from the ocean. The antennas rusted slowly in the cold wind.
"What's going on here?" Cried the scientist in frustration. The officer at the scene approached him and explained: "Look, the things you've shown us are really nice, but it looks quite complicated, turn on here, click there. Besides, while we work on your system, who will empty the ocean? The idea is nice theoretically, but in the meantime, headquarters sent us reinforcements and we need to brief and train them, so let's talk about it next month, okay?"
The scientist returned home frustrated and ranted on to his neighbor about the day's events. The neighbor was a senior officer in the reserve and was not at all surprised to hear what had happened. It's the army, he said, you need a plain and simple approach. Don't worry. I've got this.
The neighbor arrived at the beach the next day with 2 bulldozers. Each bulldozer lowered its U-blade into the sea and picked up gallons of water to shore with great noise and ease.
The soldiers and officers in the company surrounded the twinkling yellow bulldozers with admiration. Wow, said the new lieutenant, it's doing the job of an entire company! The senior command was immediately sent for to see for themselves the newest invention and they rushed to get there.
"Very nice," the generals told the neighbor, "But what's the concept behind it?"
The neighbor didn't even blink: "This is Artificial Intelligence," he explained to the generals. "It sees what the soldiers are doing and does it faster and more efficiently".
Amazing! They answered him. We'll take 20.
Thus, the neighbor opened a bulldozer factory and was knighted "sir" for his contribution to Britain's defense.
Honestly, it sounds like a dubious story, but that's exactly what happened in Chicago when the Chicago Police tried to crack down on shootings in the city. To do this, the Chicago police resorted to two methods: first the linear method and second the network method. Behind the linear method was the idea of giving scores to each resident, based on features that are related to shooting, in order to determine the level of danger each person presents. The output of the machine learning algorithm they used was a list of 400,000 people which is no small percentage of the Chicago population.
The problem with this method was that there was no method. The features changed and were updated all the time, and so the thing became an irrelevant black box. What should a cop do with someone who got a high score on the list? What type of treatment does he require? What does the guy have to do with the shooting that took place? The output of the method was out of context and lacked explainability, and therefore did not allow for effective measures to be taken. So, it comes as no surprise that an arbitrary list was followed by arbitrary arrests of its top-scoring persons after each shooting incident.
What the Chicago police actually tried to do here was to locate the submarines by giving a "submarine score" to each gallon of seawater in the ocean.
On the other hand, when they sought to address the problem with the network method, they mapped the connections between criminals. This enabled them to identify not only the leading figures involved in the shooting, but also the context and environment in which they operate. It even helped them to predict who were at risk of being shot, even if they were not yet physically involved in shootings.
This way, the Chicago Police Department located the submarines by using network laws and network features as a radar. I'm sorry, but you cannot talk about police, linear thinking and network thinking without talking about the TV series "The Wire".
If you have not seen it yet, then stop everything –
There are just a few things out there more important than network analysis, and "The Wire" is perhaps one of them.
This iconic TV show tells the story of the fight against the drug trade in Baltimore and I promise – no spoilers. In the series, a team of good detectives tries to track down a network of drug dealers. Each time they find threads, which slowly weave a network made up of people, houses and cellphones through which they locate the centers of gravity in the trade network.
But instead of following up on the network's core, the high command forces the team specifically, and the police in general, to advance toward the goal in a straightforward, fast manner. According to their understanding, police should act and raid places where there are known to be drugs, that is, the sellers' corners on the street. Of course, the problem is that the amount of drugs they stash there is small and the juvies running the corners are small-time criminals. But even they know how to outsmart the police and hide the few drugs they have from 5/0 raids. So, most of the raids usually don't amount to anything, but appear as proactive policing to the commanders.
The commanders' linear thinking is reflected in the concept of COMPSTAT, which boasts of using crime data to dictate police activity. In this method, they mark areas where there is drug trafficking and try raiding them or showing a presence there as prevention, while ignoring the complex and dynamic reality. The traders, on the other hand, just wait for the cops to go, or move somewhere else. This way the drug trade continues, and its infrastructure is untouched. Just as the British Navy tried to empty the ocean using shovels, so in "The Wire" the police try to prevent drug trafficking through a wide police presence in the city.
In contrast to the linear method that looks at the drug problem bottom-up, and focuses on the trees, the network method allows us to look at the problem top-down to show us the forest and produce context to the data.
But there's no magic. There is a catch here and the catch is that network science is not intuitive. This does not mean that network principles are complicated, it just means that we are not used to this way of thinking.
These days, we experience first-hand the implications of linear thinking when we see how governments deal with Covid-19. A pandemic is a network, in this case, a contact network. And to dismantle a network, you need to map it so you can use network laws against it. What happens instead is that we apply linear thinking and try to test as many trees as we can, hoping to test the entire forest. But the difference is that in this case, the trees (or submarines) keep moving and getting infected.
That's why there will be an episode on this podcast dedicated to Covid-19 and network science.
For those who do want to find their submarines, SNA can serve as an excellent radar of reality.
During my service, I studied Islam at the academy. My motivation was that if I understood "true Islam" in depth, I would better understand the reality Jihadist movements act in. My innocence was taken from me in the first lesson by Dr. Leah Kinberg when she explained that there is no such thing as "true Islam", and that as there are a billion Muslims, there are a billion ways to understand what Islam is. What a bummer. I was gravely disappointed.
In the rest of my studies I've learned a lot and got great tools to understand reality, but it was still frustrating - reality is complex and as a soldier, I was looking for a way to simplify it. We do not have time for complexity – It's the army for God's sake!
And then came the world of machine learning ... Ah.. the potential. The feeling that there are hidden patterns in the data, some elusive truth that's just waiting for the right algorithm to expose it ... I got chills.
Time after time, experiment after experiment, with the help of known figures and with some unknown ones, we tried to find those elusive patterns and …. we did not find any.
All those black boxes we've produced gave us the answer "42" and now, go figure.
By the way, machine learning is great for some things and less so for other things. Understanding the network field helps, among other things, to understand the advantages and limitations of machine learning and also how to improve it, but we will talk about this in another episode.
In the meantime, I did not despair. I knew that humans are smarter than any computer. I read it in a book. I told myself that if the computer could not find the laws of the universe and everything else, then we - the intelligence officers – who, more or less, understand the world - would formulate the laws ourselves. You can guess how this turned out.
As the saying goes: military intelligence is a contradiction in terms.
This was my eye-opening experience that led me to the understanding that we understand nothing. We have no idea how our data behaves and hence how the world operates.
Then, in a moment of weakness, I was reminded of my meetings with Dr. Yaniv Altshuler or as we affectionately called him "the MIT guy". Dr Altshuler was talking about something called graph theory and Social Network Analysis and how one can analyze the data using the network's universal rules or something like that. I felt I had nothing to lose, so together with Daniel "Rambo" Rambishevsky who later joined Google and Dror Goldin who later joined Facebook, we began to explore in depth the world of networks, graph theory and SNA, without leaving a paper unread or an algorithm untested.
And it was amazing.
Some of you might remember the trend in the 90's where people stared at a blurred picture and after a moment of concentration and changing eye focus suddenly an amazing 3D image was revealed. So, I guess it felt that way. I say I'm just guessing, because I never was able to see anything in those damn pictures.
But the most amazing thing was discovering how misleading our intuitions can be.
As a person who is right all the time, it was refreshing to see how admitting to mistakes and to ignorance frees us to find the next big thing instead of getting bogged down in a sand trap...So, let me stop here and emphasize that the podcast will not deal with New Age and 2cents philosophies but with graph theory, network analysis and SNA (Social Network Analysis).
The term social is a bit misleading, since although there are many applications of graph theory in social networks, the SNA principles are not limited to social networks alone.
When we think of the term "Social network" we usually think of Facebook, Twitter and the like, but the Internet itself is a Social network and so is a network between servers and even a soccer game.
So, let's define what a network (or graph) is: a graph and a network are synonymous.
A network consists of two things: Nodes (also referred to as vertices or players in the network) and edges (the ties, links or connections). Each edge or tie also has a measure of strength, called weight. For example, if I sent 3 messages to a friend and he sent me 3 back then our relationship weight is 6. As can be understood from the example, there can be relationships that are reciprocal and there are relationships that only go one-way. When we consider the direction of the relationship in a network it is called a "directed network". And when we ignore the direction of the connection, it is called an "undirected network".
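For listeners who like to see things in code, here is a minimal sketch of the message example in Python. The names and the message log are made up for illustration, and counting messages is only one possible way to define weight. It shows the same data read once as a directed network (who sent to whom) and once as an undirected one (direction ignored).

```python
from collections import Counter

# Hypothetical message log: each item is a (sender, receiver) pair.
messages = (
    [("me", "friend")] * 3      # I sent my friend 3 messages
    + [("friend", "me")] * 3    # and he sent 3 back
    + [("me", "colleague")]     # a one-way tie: no reply
)

def directed_weights(pairs):
    """Directed network: (a, b) and (b, a) are two different edges."""
    return Counter(pairs)

def undirected_weights(pairs):
    """Undirected network: direction is ignored, so both directions add up."""
    return Counter(frozenset(pair) for pair in pairs)

directed = directed_weights(messages)
undirected = undirected_weights(messages)

print(directed[("me", "friend")])                 # 3
print(directed[("friend", "me")])                 # 3
print(undirected[frozenset(("me", "friend"))])    # 6
print(directed[("colleague", "me")])              # 0, the tie only goes one way
```

In the directed view the reciprocal relationship shows up as two edges of weight 3, and in the undirected view as a single edge of weight 6, which is the relationship weight from the example.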
And the thing to remember is that anything that can be connected via a link or an edge is a network. On Facebook it can be a friendship between two users. On Twitter it can be who retweeted whom. On the Internet it can be which site points to which site. On a computer network, which server is connected to which server. In team sports, like soccer, who passed to whom is also a network. Speaking of soccer, I can't see the attraction in it. I'm more of a football fan (Green Bay Packers anyone?), so I beg the world's forgiveness, and will compensate by dedicating the future "moneyball" episode on this podcast to networks in soccer. As long as I'm not forced to watch. Another cool example of constructing data in the form of a network was by Omer Koren, the founder of Webiks, who turned a radio station playlist into a network. The idea behind it was that singers are related if they are played before or after one another, 'cause you rarely hear Jay-Z after Leonard Cohen.
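The playlist idea can be sketched in a few lines of Python. This is only an illustration of the principle, not Omer Koren's actual method: the playlist below is invented, "Artist A/B/C" are placeholders, and the function name is hypothetical. Two artists get an edge whenever one is played right after the other, and the weight counts how often that happens.

```python
from collections import Counter

# Hypothetical playlist, in the order the songs were played.
playlist = [
    "Jay-Z", "Artist A", "Jay-Z", "Artist B",
    "Leonard Cohen", "Artist C", "Leonard Cohen",
]

def playlist_to_edges(plays):
    """Link every pair of artists played back to back (undirected, weighted)."""
    edges = Counter()
    for current, following in zip(plays, plays[1:]):
        if current != following:  # skip the same artist played twice in a row
            edges[tuple(sorted((current, following)))] += 1
    return edges

edges = playlist_to_edges(playlist)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} - {b}: {weight}")
```

On this toy playlist, Jay-Z and Leonard Cohen never end up directly connected, while each is tied to the artists played around them. On a real playlist, those ties are what would let clusters of related artists emerge.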
But why go to all that trouble of turning our datasets into networks? That's because networks allow us to acquire more knowledge without having to add more information.
For beginners in the field, the very visualization of the data as a network allows for "situational awareness" and is a fertile ground for drawing conclusions. Often, we can see this on television when a tormented detective stands in front of a cork board with pictures of those involved in the scheme with pins and wires connecting them.
This type of analysis is called link analysis, and it's nice for very small networks and for a sense of satisfaction in detective series, but network science offers us much more, especially when we handle Big Data. In this case, we'll need to adopt advanced techniques and a different way of thinking, and that's where network analysis, or SNA, kicks in and lets us in on all the good stuff. And here comes the secret ingredient – all networks follow rules. Universal rules.
It does not matter if it is a Facebook network, a criminal network, a server network or the neural network in the brain: the same rules apply to every network. It is these laws that allow us to analyze, research, classify, disassemble or control the network, any network, efficiently and effectively.
Equally interesting is how these laws were discovered, how they are reflected in the various aspects of the network and in the world in general, and how they came to be. All of this and more will be covered in the following episodes.
So, remember – with great power comes great responsibility, and network analysis is as powerful as it gets. And it's also kinda fun.
A few ending remarks:
First of all, a tip for our listeners: Advanced episodes often rely on knowledge acquired in previous episodes, so it is recommended to listen to the episodes in order. Except this one. Feel free to skip it.
This podcast was originally in Hebrew, making it the first Jewish podcast dedicated to network science. But even outside of Israel I haven't come across any other podcast in this field. Some may have done an episode about some graph related issues, but I didn't find another podcast dedicated to network science. So, if you find another, please let me know. I would love to sue them. And lastly - Due Diligence. For reasons I cannot elaborate upon, I will be giving Facebook a lot of credit. Sorry.
Did you enjoy, and want to share? Have you suffered and you do not want to suffer alone?
Tell your friends or rate us here. Thank you! Much appreciated!
The music is courtesy of Compile band. Check them out!
See you in the next episode of NETfrix (: