Data Variables

While the lament of investors decades ago might have been that they did not have access to data, or enough of it, to use in their analysis, the problem we face today is a different one. We are inundated with data and are not sure what to do with it. The key is to convert the data into measures that you can use to create a narrative about a company and to value it. It is with that objective in mind that I analyze the data and come up with my measures of risk, profitability, leverage and value. While most of these measures are widely used, I do create my own twists on them, reflecting my corporate finance and valuation views. Thus, my belief that accountants are wrong in their treatment of operating leases and R&D leads me to capitalize both, which in turn changes the operating income, invested capital and other derived measures for a company. The table below lists the data variables for which I report industry averages, with links to a document where I explain how I estimate each of the numbers.
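To make the effect of capitalizing R&D concrete, here is a minimal sketch of one common approach, assuming straight-line amortization over a life equal to the number of prior years supplied. The function name `capitalize_rd` and every number in it are hypothetical, chosen only for illustration; this is not the exact procedure or data behind the reported averages.

```python
def capitalize_rd(current_rd, past_rd, operating_income, invested_capital):
    """Adjust operating income and invested capital for capitalized R&D.

    past_rd lists prior years' R&D expenses, most recent first. Its length
    is taken as the amortizable life (an assumption of this sketch).
    """
    life = len(past_rd)
    # Unamortized R&D: all of this year's expense, plus the remaining
    # fraction of each prior year's expense under straight-line amortization.
    research_asset = current_rd + sum(
        rd * (life - t) / life for t, rd in enumerate(past_rd, start=1)
    )
    # Each prior year's expense is amortized in equal slices over its life.
    amortization = sum(rd / life for rd in past_rd)
    # Add back the expensed R&D, then charge the amortization instead.
    adjusted_income = operating_income + current_rd - amortization
    adjusted_capital = invested_capital + research_asset
    return adjusted_income, adjusted_capital


# Hypothetical company: R&D of 100 this year and 90, 60, 30 in the prior
# three years, reported operating income of 500, invested capital of 2000.
income, capital = capitalize_rd(100, [90, 60, 30], 500, 2000)
print(income, capital)
```

With a growing R&D budget, the current expense exceeds the amortization charge, so both adjusted operating income and adjusted invested capital end up higher than the reported figures.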

Industry Breakdown

Every service has its own breakdown of companies into sectors or industries and each is imperfect, partly because some companies are difficult to pigeonhole (for example, is Apple a smartphone, an electronics or an entertainment company?) and partly because of changes in the way businesses operate (think of the online revolution and how it altered industries). In creating industry groupings, though, you face a trade-off. If you make the categories too broad (manufacturing, retailing), you may miss key differences across businesses. If you make them too narrow (smartphone manufacturing, candy retailing), you will end up with small sample sizes and businesses that cannot be easily separated from each other. I don't claim to have cracked the code on this one, but I have tried my best, given the raw data groupings that are provided to me, to break companies down into just over 100 industry groupings. If you use the averages that I report, you are probably curious about which companies are in each industry grouping. To help you answer that question, I have a spreadsheet that lists the industries and the companies in each one.

Regional Breakdown

When you go global with the data, you get the advantage of a huge database, but you may miss key differences in both corporate finance measures and valuation metrics across regions. Consequently, I have tried to create regional breakdowns of the data, reflecting partly conventional practice and partly my biases. The table below summarizes the most recent regional breakdown, with the number of firms in each region:

Region: Number of firms
United States: 7330
Europe (EU, UK, Switzerland & Scandinavia): 6655
Japan: 3679
Emerging Markets (Asia, Latin America, Eastern Europe, Mid East and Africa), with a further breakdown for India & China: 20578
  1. China: 5008
  2. India: 3432
Australia, New Zealand & Canada: 4436
Global: 42678
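As a quick consistency check on the table above, the regional firm counts can be summed and compared with the global figure. This is only an illustrative sketch; the dictionary names are my own, and the numbers are copied from the table.

```python
# Firm counts by region, copied from the regional breakdown table.
region_counts = {
    "United States": 7330,
    "Europe (EU, UK, Switzerland & Scandinavia)": 6655,
    "Japan": 3679,
    "Emerging Markets": 20578,
    "Australia, New Zealand & Canada": 4436,
}
# China and India are subsets of Emerging Markets, so they are kept
# separate and not added into the global total a second time.
emerging_subsets = {"China": 5008, "India": 3432}

global_total = sum(region_counts.values())
other_emerging = region_counts["Emerging Markets"] - sum(emerging_subsets.values())

# Share of the global sample that each region represents.
shares = {region: count / global_total for region, count in region_counts.items()}

print(global_total)
for region, share in shares.items():
    print(f"{region}: {share:.1%}")
```

The five regions add up to the reported global count, and China and India together account for a little under half of the emerging market firms.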

I know that some of you would like information on just the companies in your country, and while I provide averages for key variables by country, I don't break down every data set by country.

Company Lookup

  1. Look up a company: In some cases, you may be wondering what industry and region I have grouped a specific company into. To address that, I have created a spreadsheet with a pull-down pane to look up an individual company. It is not the most flexible or efficient way to do this, I am sure, but it is the quickest.
  2. Company-specific data: Until last year, I had company-specific data on key variables available for download as a large spreadsheet. Unfortunately, I cannot continue with that practice because the data providers believe that I am improperly sharing proprietary data. Since I shared derived variables, rather than raw data dumps, I disagree, but this is one of those cases where discretion has to be the better part of valor until I can find a data provider who is comfortable with my data sharing.