Sunday, March 6, 2011

Networks and unnecessary complexity

Computer networks are one of the most important inventions of the previous century. Being able to convey information to any other computer in the network is a powerful ability. Note that having computers as the nodes is what truly distinguishes them from other kinds of communication networks: every other kind of network requires a human at some point, either to transmit the information further (postal system, telegrams) or to receive it (telephones). A computer can act as a retransmitter, usually in the form of a router, and it can not only receive but also store information, so that the human user can retrieve it whenever he wishes.

I was recently forced to study networks, using the book "Computer Networks" by Andrew S. Tanenbaum (an excellent starting-level book on networks, by the way). Although I am not an expert in networks, far from it, discussions with fellow students with network expertise convinced me that this book covers almost all known aspects of networking, even if at an introductory level. So I consider my network adventures a kind of BFS and, frankly, I do not wish to go deeper. If you are still wondering how someone can be "forced" to study, it is possible if you combine a vast amount of reading material for an exam with the flu. I am sure most of you are familiar with the concept of spending time on a task that is irrelevant to your own goals but is mandatory.

I was not a stranger to computer networks, but this time there was a different feeling. Unintentionally, I was applying my computational complexity background to this rediscovered knowledge. Of course, I already knew some of the clashes between the two areas. For example, I am very well aware that big-oh notation is not tolerated in networks: constants can make a great difference in the performance of a routing or collision management algorithm. This was not news to me, and it hardly surprised me as I read chapter after chapter, the exam clock ticking away. However, another issue popped up.
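To make the point about constants concrete, here is a toy sketch of my own (not from the book): two checksums over an Ethernet-sized payload, both linear in the packet length, but with very different constant factors in CPython. At millions of packets per second, that constant decides whether a device keeps up with line rate.

    import timeit

    packet = bytes(range(256)) * 6  # roughly one Ethernet-sized payload

    def checksum_loop(p):
        # Explicit byte-by-byte loop: O(n), but with a large constant.
        total = 0
        for b in p:
            total = (total + b) & 0xFFFF
        return total

    def checksum_builtin(p):
        # The same O(n) work pushed into C: a much smaller constant.
        return sum(p) & 0xFFFF

    # Both are Theta(n) and compute the same value, yet the second runs
    # several times faster. Big-oh hides exactly the difference that
    # matters here.
    print(timeit.timeit(lambda: checksum_loop(packet), number=10000))
    print(timeit.timeit(lambda: checksum_builtin(packet), number=10000))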

Another, much more important goal of mine is my diploma project, which is necessary for my graduation. It involves the time hierarchy theorem, so I am reading one of the foundational papers of computational complexity, the 1965 paper by Hartmanis and Stearns, "On the computational complexity of algorithms". A very important result of the paper is the following:

 Given any computable problem, there is a function T(n) that upper bounds its complexity.

There is perhaps an even stronger statement to be added: that this function is at most the running time of the problem's brute-force algorithm.

Even if the second statement is not true, this is a very important result. Given any computable problem, we know that there is a maximum effort we have to make. For many problems, we have algorithms that allow us to reduce that effort by a lot. For other problems, we do not know if such algorithms exist, and we suspect that brute force is the best we can do; that is one way to view the P vs NP question.
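To see why such a bound always exists, take satisfiability as a concrete case (my own sketch, not an example from the Hartmanis-Stearns paper): trying all 2^n truth assignments gives an explicit upper bound of roughly 2^n times the formula size, whether or not anything faster exists.

    from itertools import product

    # Brute-force SAT: clauses are lists of literals, where the literal i
    # means "variable i is true" and -i means "variable i is false".
    # Enumerating all 2^n assignments yields a concrete T(n) upper bound.
    def brute_force_sat(num_vars, clauses):
        for assignment in product([False, True], repeat=num_vars):
            if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
                   for clause in clauses):
                return assignment  # a satisfying assignment
        return None  # unsatisfiable

    # (x1 OR x2) AND (NOT x1 OR x2)
    print(brute_force_sat(2, [[1, 2], [-1, 2]]))  # (False, True)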

And that is what surprised me about networks. In many problems, unnecessary effort has been put in. First of all, there are way too many protocols, let's simplify that by saying too many algorithms, for exactly the same problem. I am trying to be as reasonable as possible here. For example, wireless transmission needs different algorithms than wired transmission, simply because different issues arise. The same can be said for point-to-point communication and broadcasting. But we are talking about using different approaches, with no specific advantages, for the same problem. Those of you who have read the book know that Tanenbaum admits this and blames political and business agendas for it. I wholly agree with him. I feel that a handful of protocols could solve our networking problems, even allowing for the different approaches, like connection-oriented vs connectionless. I believe it is too late to reverse this trend. This beast of protocols is uncontrollable, and everyone has their say in how computers should communicate.

Even without the ocean of protocols, I still believe there is unnecessary complexity in networking today. The following example bugged me: my home computer uses the same Ethernet protocol that routers in a Google data center do. Using different algorithms for different input sizes is not uncommon. Circuit complexity is actually about using a different algorithm (circuit) for every input size. I am talking about something more modest: having a bound on the input size that decides which algorithm to use. Hybrid sorting is a common example: many quicksort and mergesort implementations switch to insertion sort once the input drops below a small threshold, because the simpler algorithm's constants win on small arrays (see the sketch below). Is it too much to ask that a connection from one computer to a router use a different protocol from a large-scale enterprise LAN? Note that this can be part of the same protocol and does not require two different protocols, while it allows for better performance.
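Here is a minimal sketch of that threshold idea in the sorting setting (the cutoff of 32 is illustrative; real libraries tune such thresholds empirically):

    CUTOFF = 32  # illustrative value, not a tuned one

    def insertion_sort(a):
        # Low-constant O(n^2): wins on small inputs.
        for i in range(1, len(a)):
            key, j = a[i], i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key
        return a

    def hybrid_sort(a):
        # O(n log n) mergesort overall, but small pieces are delegated
        # to insertion sort: one "protocol", two algorithms inside.
        if len(a) <= CUTOFF:
            return insertion_sort(a)
        mid = len(a) // 2
        left, right = hybrid_sort(a[:mid]), hybrid_sort(a[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        return merged + left[i:] + right[j:]

    print(hybrid_sort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]

That "one protocol, two algorithms inside" structure is exactly what I would like to see more of in networking: the dispatch on size lives behind a single interface.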

There are, of course, other aspects of real life that I consider full of unnecessary complexity, although not always in the sense of computational complexity but more like system complexity. You can read more about complexity and its different aspects on Wikipedia for the time being: http://en.wikipedia.org/wiki/Complexity . Perhaps I will have a post on this subject. I believe that state bureaucracy and legal procedures are fine examples of unnecessary complexity, although it is easy to fall into the trap of using inefficient procedures for difficult problems. As with most problems, there is a fine balance one has to maintain, and it is not easy at all to do that.

I hope you enjoyed this post and found it thought-provoking. Do you know of any examples of unnecessary complexity? Do you believe it is possible to reverse a trend of a multitude of solutions, and if so, do you have a specific approach in mind?

PS: I hope I can make another post after a shorter interval; the start of the spring semester did not allow me to blog as often as I would like.

2 comments:

  1. I think it's fine to have multiple algorithms; the problem is putting them all in a textbook (not a reference book) as though they're all crucially important for someone learning the subject. Consider the various different proofs of the infinitude of primes, the uncountability of the reals, the fundamental theorem of algebra. Of course, it's rare to find more than one of any of these in the same book!

  2. Well, reflecting on proofs, it is good to have many, as long as they use different approaches, so that we have a better understanding of the statement proven, why it is true, etc. In textbooks, of course, the criteria for selecting a specific proof are its comprehensibility and how it relates to the rest of the material in the book.

    However, algorithms must also be practical to use in order to be useful. So a new protocol or algorithm, in my opinion, should justify its existence by offering something different from (or more efficient than) what already exists.

    In the reverse direction, we have to trust the good intentions and integrity of the author, so that he does not advertise one protocol over another, especially when the technologies in question are new. An author gains a lot of respect in my eyes when he justifies his pedagogical approach, especially when multiple choices are available.
