TIOBE Programming Community Index Definition

Since there are many questions about the way the TIOBE index is assembled, a special page is devoted to its definition.

Ratings

The ratings are calculated by counting hits of the most popular search engines. The search query that is used is +"<language> programming". The query is executed on regular Google, Google Blogs, MSN, Yahoo!, and YouTube web search, restricted to the last 12 months. The web site Alexa.com has been used to determine the most popular search engines.

The number of hits determines the rating of a language. The counted hits are normalized for each search engine over the first 50 languages. In other words, the first 50 languages together have a score of 100%. Let's define "hits50(SE)" as the sum of the number of hits for the first 50 languages for search engine SE, and "hits(PL,SE)" as the number of hits for programming language PL for search engine SE. The formal definition of the ratings then becomes

(hits(PL,SE1)/hits50(SE1) + ... + hits(PL,SEn)/hits50(SEn))/n

where n is the number of search engines used. YouTube counts for only 7%; each of the other search engines counts for 23%.

Status

Besides the rating of programming languages, there is also a status indicated in the TIOBE chart. Programming languages that have status "A" are considered to be mainstream languages. Status "A-" and "A--" indicate that a programming language is between status "A" and "B". If a programming language has a rating that is higher than 0.7% (yes, this number is arguable but we had to fix it somewhere) for at least 3 months, it is awarded status "A". In the first two of those months the programming language receives status "A--" and "A-" respectively. The opposite holds for languages that go from status "A" to status "B". So if a language had status "A" 2 months ago, a rating of "0.607%" last month and a rating of "0.687%" now, it will have status "A--".
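The rating formula and the status rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code TIOBE runs. The function names `ratings` and `next_status`, the `weights` parameter, and the sample hit counts are all invented here. Two interpretations are assumed: per-engine weights are divided by their sum (equal weights reproduce the plain average in the formal definition), and status moves one step per month along the ladder B, A--, A-, A, which matches the example in the text but is not spelled out for ratings that fluctuate around the threshold.

```python
THRESHOLD = 0.7  # percent; the cutoff the page uses for mainstream status
LADDER = ["B", "A--", "A-", "A"]  # assumed ordering, lowest to highest


def ratings(hits, weights=None):
    """Return {language: rating in percent}.

    hits maps search engine -> {language: hit count} and is assumed to
    contain only the first 50 languages, so each engine's total is hits50(SE).
    weights maps search engine -> relative weight; equal weights reproduce
    the plain average in the formal definition.
    """
    if weights is None:
        weights = {se: 1.0 for se in hits}
    total_weight = sum(weights[se] for se in hits)
    languages = {pl for per_lang in hits.values() for pl in per_lang}
    result = {}
    for pl in languages:
        score = 0.0
        for se, per_lang in hits.items():
            hits50 = sum(per_lang.values())  # hits50(SE)
            score += weights[se] * per_lang.get(pl, 0) / hits50
        result[pl] = 100.0 * score / total_weight
    return result


def next_status(previous, rating):
    """Assumed rule: one rung up if rating > THRESHOLD, else one rung down."""
    i = LADDER.index(previous)
    i = min(i + 1, len(LADDER) - 1) if rating > THRESHOLD else max(i - 1, 0)
    return LADDER[i]


# Invented hit counts for two hypothetical engines and languages.
sample = {"engine_a": {"X": 60, "Y": 40}, "engine_b": {"X": 20, "Y": 80}}
print(ratings(sample))

# The example from the text: status "A", then ratings of 0.607% and 0.687%.
status = "A"
for monthly_rating in (0.607, 0.687):
    status = next_status(status, monthly_rating)
print(status)
```

Passing something like `weights={"YouTube": 7, "Google": 23, ...}` to `ratings` would apply the stated per-engine percentages. The engine keys are whatever names the caller uses in `hits`.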
From a supportability point of view, it is strongly advised to stick to mainstream languages for industrial, mission-critical software systems. This is for three reasons:
Groupings and Exceptions

Programming languages that are very similar are grouped together. Currently the maximum of the hits of the individual languages is taken into account when calculating the ratings of groupings. In the future we will do a better job and take the union (from mathematical set theory) of all the hits. There is a lot of discussion about what languages should be grouped together. It is very hard to have a definition that can be applied to all situations, so we just made a choice we thought reasonable. If you disagree, please notify us. Keep in mind that you shouldn't submit grouping/degrouping proposals just to get a higher rating ("take C and C++ together") or ungroup languages for tracking a minor variant ("decouple Mono from C#.NET"). The following table contains the definition of all groupings and exceptions.
Artifacts or ideas on improving the calculation of the TIOBE index will be received with gratitude (tpci@tiobe.com).