The Machine Learning Journey

Wednesday, February 18, 2015

Paywalls at Mexican newspapers and their effect on web traffic.

For many years I have been an avid reader of newspapers, both Mexican and foreign. One of the papers I always enjoyed the most was Reforma.

I remember how my mom used to buy it every Sunday for the comic strips, while I entertained myself reading the opinion column Templo Mayor and the adventures of Germán Dehesa. On Mondays I waited anxiously for the NFL scores to see how my beloved 49ers had done. Remember, this was long before the Internet, back when American football was a niche sport.
I still have a clipping from the first game of the 1994 season, when the 49ers won the Super Bowl.

As I grew up, I came to appreciate its journalistic quality and integrity. It seemed to me a very good place to get frank opinions and a breath of fresh air compared with the officialism of newspapers like Excelsior or the indomitable left of La Jornada and Proceso.

Sadly, the newspaper that once took charge of shaping the minds of Mexicans has turned itself, by its own hand, into a newspaper for rich people, where only someone armed with a subscription can access the content. This model is known as a "paywall".

And in Reforma's case it is an absolute, immovable wall. Newspapers like the New York Times and the Wall Street Journal have similar models, but they allow a fixed number of articles per month (around 10). Not only that: if you click a link to the NYT or the WSJ from Google or Facebook, you get access to the story, because they understand the importance of having their content available online.

Newspapers have undoubtedly been one of the businesses hit hardest by the digital revolution, but that is nothing new; this kind of model could be seen coming since the early 2000s.

Faced with this change, many newspapers had to reinvent themselves, and today the NYT is one of the most visited sites on the Internet.

Reforma, however, has stayed with a business model from 20 years ago. It tries to get all of its money from subscriptions, leaving the great majority of people without access to its reporting.

And that way, all it manages to do is shoot itself in the foot. What is the point of having social media accounts (Twitter, Facebook) if people cannot access the information you post on free networks?

It is enough to see that, in terms of followers and traffic, El Universal, a newspaper that has stayed free (with a premium subscription model), has crushed Reforma in online presence. Every day Aristegui is shaping the country's political discourse as the most followed Mexican news source on Twitter and Facebook (2 million more followers than Reforma). One person, on one network, has more pull than Reforma.

One has to ask the following question: what kind of newspaper does Reforma want to be? A newspaper with 200,000 readers in a country of 120 million? A clearly elitist newspaper? A newspaper where monetary gain obviously takes precedence over the distribution of the news?

It is not hard to see that this will become a self-fulfilling prophecy, where the captive audience will be right-wing and conservative, which will lead the paper to hire right-wing, conservative columnists.

And with so many players appearing online, such as Animal Político and Sin Embargo, if it carries on like this Reforma will cease to exist as a legitimate news source and become a footnote.

Reforma has to change the way it does business and modernize. Its anachronism is just a symbol of the fear of progress and change that has plagued so much of Mexico. I would love to share the Sunday paper with my daughter, the way my mom did with me. Sadly, I doubt it will be Reforma.

Wednesday, February 4, 2015

Mexico And Machine Learning (The long-overdue rant)

Spanish Version Follows

Go to your favorite search site (Bing, Google, Yahoo) and do the following search:

"Machine Learning Jobs", Mexico

Ready? No? I'll wait a bit longer........Ready? Cool.

As you can see, the job offers in Mexico that include Machine Learning as a keyword range from one to none. This suggests that hardly anyone in Mexico thinks that analyzing the data they have is worth their money.

Telmex, the largest telephone company in the world, is also one of the least developed. Just try these search terms:

"Telmex", "Machine Learning"
"ATT", "Machine Learning"
"NTT", "Machine Learning"

These are the largest communications companies in Mexico, the USA, and Japan. While ATT and NTT have research labs, Telmex does not have a single research lab oriented toward data analytics.

Let me remind you, Telmex is owned by the richest man in the world, and its net profit is well over ATT's, yet it shows no commitment to research.

So, what can we expect if the largest company in Mexico doesn't do research at all? You end up with a plethora of start-ups (PyMEs) that do not care about R&D either. Even though Machine Learning is relatively cheap compared with other R&D overhead costs, they see no interest in analyzing their data.

It hurts me to see how most of the most innovative start-ups in Mexico are basically rediscovering what was already done in the US 10 years earlier, and how, due to the neglect of science and technology, most people with PhDs have to either stick to academia or leave the country altogether.

Not that scoring a job in academia in Mexico is easy either. The process is extremely murky, and the whole establishment is extremely oligarchic.

When I applied for a job at Mexico's top university, after I had sent all my documentation, their answer was: "Thank you for your application, but be aware that it is extremely difficult to get in here, so do not have high hopes"......

Who does that? What kind of self-proclaimed world-class institution even gives that kind of response to job applicants? Hell, I've applied to Google and Microsoft and never got that kind of answer.

And all of that brings us to Machine Learning. Since it is really hard to score a job in academia, and there is virtually no R&D, new trends (like Machine Learning) take a really long time to enter Mexico. Researchers in CS are still using decades-old techniques and couldn't care less about doing state-of-the-art research because..... tenure?

Anyway, that is how things are. Luckily, I know plenty of people who still have high hopes and are trying to push innovation in a real sense.

Mexico and Machine Learning

English Version

Go to your favorite search site (Bing, Google, Yahoo) and search for the following terms:

"Trabajos en Machine Learning", México

Ready? Done? I'll wait...... All set? Ok

As you can see, the job market for someone specialized in Machine Learning in Mexico is practically nonexistent (there are plenty of machine operator listings, of course; it is a manufacturing country). And what does that mean? Either nobody in Mexico is interested in analyzing their data, or they are willing to pay extraordinary sums to outsource it.

Telmex, the largest telecommunications company in the world, has no Research and Development labs. Why do you think NTT and ATT have usually been at the forefront? Because they know that investing in research is a fundamental part of business development.

Remember that Telmex belongs to the richest man in the world (sometimes, depending on how the stock market moves), so I do not think he lacks financial sense, only common sense.

And this model carries over to all the small and medium-sized companies, where tremendous importance is given to cash profit with no concern for product innovation. The great majority of PyMEs in Mexico seem to be glorified technical services.

The option, if you have a PhD in this area, is academia. However, getting a job at a university and being able to do research in this area is extremely complicated.

Jobs at the best universities are scarce and highly politicized. When I tried to apply to UNAM, the reply I got by email said: Thank you for applying, but understand that it is very difficult to get a job....... Who gives that kind of answer? It sounds more like code for: we already have our candidate, we only advertised the position out of obligation. By the way, after that I was offered positions at 3 universities in the United States, so I cannot complain.

And what happens then? With such slow staff turnover, new ideas get buried in favor of the status quo. Why look for new research areas and stay at the cutting edge if it might not lead to a publication?

In my opinion, the first university in Mexico to offer a degree in Data Technology will be the one that, by the end of the next decade, leads innovation in Mexico, and sadly it does not look like it will come from public education.

Friday, January 23, 2015

Links on Data Science, Research and ML for the day (January 22, 2014)

These are some interesting links for your day:

ArcPy and ArcGIS

While I'm at it (blog updates I mean), I might as well describe what I'm doing.

I'm currently learning what should be a cartographer's or geologist's wet dream. It is a piece of software called ArcGIS, which, from what I have learned so far, looks like a really fancy CAD package oriented towards geologists.

It can definitely create beautiful maps, and while the way to learn it is less than optimal, it offers plenty of toolboxes as well as custom ones that you can import from other sources.

Right now, I'm just learning the basics, and if anyone knows how to extract parts of maps using its Python interface, that would be most helpful.

ArcGIS sample screen, borrowed from: http://www.esri.com/news/arcuser/1012/graphics/blended_3-lg.jpg

Thursday, January 22, 2015

Amazing video built with NASA's incredible gigapixel image.

I've always been a sucker for high-resolution videos and space imagery. So it comes as no surprise that I just love the following video, made using the latest gigapixel image of the Andromeda Galaxy, our nearest neighbor, released by NASA.

Enjoy in fullscreen by all means, and it is way better if you wear your headphones.

Wednesday, September 24, 2014

Why I just do not think Murphy's book is that good (Part 2, a case in point)

Well, first of all, this post is a bit rantish, but after talking with some people, it seemed only fair to give explicit examples of why I think Bishop's Machine Learning book offers an overall better learning experience than Murphy's Machine Learning: A Probabilistic Perspective.

I'll present two didactic experiments, taking the point of view of someone versed in Probability and Statistics at an undergrad level, but not so much in ML.

So the 1st experiment is to introduce the reader to the EM algorithm via Gaussian Mixture Models (GMM). Every book does it, and every book has its strengths and weaknesses.

Just so you can follow, this topic starts on page 352 in Murphy's book and on page 430 in Bishop's.

Ok, so to start off, Murphy never references the equation of the Mixture of Gaussians (granted, it is 5 pages earlier, but still, you need to keep a finger on that page and flip back and forth). Bishop, on the other hand, essentially restates the whole Mixture of Gaussians paradigm, which he already covered 200 pages earlier. It is essentially the same text, but he goes to the trouble of restating the notation, what each term means, and how the likelihood is calculated. He does mention that he already did it 200 pages before (by putting a reference to the equation), but he goes ahead anyway. Murphy does not even go to the trouble of saying where the likelihood equations are.

Furthermore, Bishop uses GMM as a motivation for EM, while Murphy follows the age-old formula of Model -> Numerical Recipe to solve it. And beware: the numerical recipe comes with zero explanation of its notation, so you have to go through the whole book looking for it.

So first experiment, Bishop is the clear winner.
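
For readers who have neither book at hand, here is a minimal sketch of the kind of recipe both chapters build up to: EM for a one-dimensional Gaussian mixture. It is not taken from Bishop or Murphy. The function names, the synthetic data, the starting means, and the small variance floor are all assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under sum_k pi_k * N(x | mu_k, var_k)."""
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def em_gmm_1d(data, means, n_iter=50):
    """Fit a 1-D Gaussian mixture with EM, starting from the given means."""
    k = len(means)
    means = list(means)
    weights = [1.0 / k] * k   # mixing proportions pi_k (assumed uniform start)
    variances = [1.0] * k     # assumed unit-variance start
    history = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate parameters from responsibility-weighted data.
        for j in range(k):
            n_j = sum(r[j] for r in resp)
            weights[j] = n_j / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / n_j
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / n_j
            variances[j] = max(var_j, 1e-6)  # floor to avoid a collapsed component
        history.append(log_likelihood(data, weights, means, variances))
    return weights, means, variances, history


# Synthetic data from two well-separated Gaussians (illustrative only).
rng = random.Random(0)
data = ([rng.gauss(-2.0, 0.5) for _ in range(200)]
        + [rng.gauss(3.0, 1.0) for _ in range(200)])
weights, means, variances, history = em_gmm_1d(data, means=[-1.0, 1.0])
```

The `history` list holds the log-likelihood after each iteration, which is a handy sanity check since EM should never make it decrease. Note that every symbol the loop uses (mixing weights, means, variances, responsibilities) is named right where it is used, which is exactly what I wish the book did.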

Second experiment: introduce the reader to a new topic that is not in Bishop's (because that is supposedly one of Murphy's advantages). In this case, Deep Learning, which is the last chapter of the book, page 1000.

Ok, so we are presented with equation 28.1:

$p(h_1,h_2,h_3,v|\theta) = \cdots$

Without any surrounding text, that equation is just useless. What are $h_1, h_2, h_3$? On the right-hand side of the equation they are multiplied by some $w$. Remember, I am a newbie in ML; I am not supposed to know directed graphical models at all. Furthermore, there is no explanation or motivation for the model at all.

A good reference book would at least direct you to where those terms were first introduced. So I went to the Notation section, where, oh surprise! $h$ is never explained, and neither is $v$. I mean, there is a $v$ for nodes of a graph, and this is a directed graphical model, so that makes sense.

So $v$ is any node in the graph. Just to be sure, I will search in the book. I assume that the previous chapter (latent variable models for discrete data, page 950) has the notation at the beginning, since by definition a deep net might be a latent variable model for discrete data (here I had to lean a bit on my ML expertise). Remember, we still have no idea what $h$ or $w$ stands for.

Ok, so the first thing I notice is that $v$ is actually words in a document, so it is not nodes, but words? I do not do NLP; what are the words in a Bag of Words supposed to be? Features? Examples?
Why words, why not something more general?

But at least we know $v$ are words. We still do not know what $h$ or $w$ are.

So perhaps it is even further back.

Graphical Models Structure Learning (page 909): obviously, no notation in the first few pages. Let's go over the chapter; I think I saw an $h$. Yeii!! Success!! Page 924 defines $h$ as hidden variables.

So $v$ are words and $h$ are hidden variables, right? Ok cool, wait.... I just bumped into page 988, and here it says let's overwrite the notation (just because, I guess), and now $v$ is actually the input (the visible layer, by the way, hence the $v$). So $v$ went from being nodes, to words, to the input formerly known as $y$ (like Prince!!)

Now, we understand the left hand side of equation 28.1, oh yeah, we haven't defined the weights yet (yeah those pesky $w$ that I just defined in a single line) but I'm too tired to even try. They are weights that mark the importance of each hidden or visible node, and are the things we are trying to find (there! print it and paste it at the top of the book, a single line!!!)
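
To show how little it takes to pin the symbols down, here is a minimal sketch of a generic directed network with three binary hidden layers and one visible layer, sampled from the top down. This is not a reconstruction of equation 28.1. The layer sizes, the random weights, the sigmoid link, and the fair-coin top layer are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(a):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-a))


def sample_layer(parents, weights, rng):
    """Sample binary units given their parent layer.

    weights[i][j] is the weight from parent unit j to unit i (hypothetical layout).
    """
    units = []
    for row in weights:
        p_on = sigmoid(sum(w * h for w, h in zip(row, parents)))
        units.append(1 if rng.random() < p_on else 0)
    return units


def random_weights(n_out, n_in, rng):
    """Random weight matrix; in a real model these are what learning estimates."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


rng = random.Random(0)
sizes = {"h3": 2, "h2": 3, "h1": 4, "v": 6}  # illustrative layer sizes

w2 = random_weights(sizes["h2"], sizes["h3"], rng)  # h3 -> h2
w1 = random_weights(sizes["h1"], sizes["h2"], rng)  # h2 -> h1
w0 = random_weights(sizes["v"], sizes["h1"], rng)   # h1 -> v

# Ancestral sampling: top hidden layer first, visible layer last.
h3 = [1 if rng.random() < 0.5 else 0 for _ in range(sizes["h3"])]
h2 = sample_layer(h3, w2, rng)
h1 = sample_layer(h2, w1, rng)
v = sample_layer(h1, w0, rng)
```

Here `h1`, `h2`, `h3` are the hidden layers, `v` is the visible layer (the input), and each `w` matrix holds the weights connecting one layer to the next. That is all the context a newcomer needs before reading the equation.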

We essentially backtracked two chapters and wasted a ton of time looking for a single line where he defined the variables for the first time, and if you missed it, like I did with page 988, you are essentially doomed to have a bad understanding of the topic.

Anyway, I guess that is off my chest, and I promise this is the last post on Murphy's book. My take-home message is: this is probably a good book if you already know machine learning (and are familiar with the particular flavor of its notation), but it is by no means an "essential reference for practitioners of modern machine learning". The last thing a practitioner wants is to go through the entire book just to implement EM. Especially today's practitioners, but that is a rant for another day.

At the end of the day, I enjoy teaching ML to people and introducing these fascinating concepts to them, and it is very frustrating that the community endorses a book that does such a poor job of this.

See ya