How data in Intrinio's Company Fundamentals API gets from a Filing to an API response.
Intrinio’s Fundamentals product provides a wealth of information about companies that file with the U.S. SEC. In fact, it can be quite overwhelming just how much data is available. This document lays out what each piece of data is, where it comes from, and how you can use it. But this requires a little background.
Filings
All publicly traded companies in the United States – even small companies – must periodically file a variety of reports with the Securities and Exchange Commission. You can think of it like having to file additional tax returns, but more frequently and with a completely new set of rules. In fact, just like the IRS, the SEC has a slew of forms that are required under varying circumstances.
There are two of these reports, however, that stand out as being the most important for anyone trying to analyze how a company is doing financially: The 10-Q and 10-K.
Fiscal Periods
A 10-Q is filed for each of the first three quarters of a company's fiscal year, and a 10-K is filed for the entire year (which includes the fourth quarter). The start and end dates of each fiscal period define which financial data a report covers, and that data can be used to determine the financial health of a company. For a lot of companies, the fiscal year follows the calendar year, so the reports look like this:
For some companies, fiscal quarters are not the same as calendar quarters. This is especially true of retailers, whose fiscal Q4 ends in January or February of the following calendar year (they use the well-named “retail calendar”). For example, Target Corp’s fiscal year ends on February 2, so its calendar of fiscal quarters looks a little different. Here’s what the fiscal periods might look like for a company whose fiscal year 2018 ends on February 2 (of 2019):
Retailers do this for a reason you might not suspect: it allows companies to have the same number of weekends and weekdays across years and quarters, so year-over-year and quarter-over-quarter comparisons are more accurate.
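To see why this works, here is a minimal Python sketch that splits a 52-week fiscal year into four 13-week quarters, working backward from the year-end date. The function name retail_quarters is made up for illustration, and the even 13-week split is an assumption: real retail calendars can differ, and some fiscal years run 53 weeks.

```python
from datetime import date, timedelta


def retail_quarters(fiscal_year_end: date) -> list:
    """Split a 52-week fiscal year into four 13-week quarters (an assumed layout)."""
    one_day = timedelta(days=1)
    # A 52-week year that ends on fiscal_year_end starts 364 days earlier, plus one day.
    start = fiscal_year_end - timedelta(weeks=52) + one_day
    quarters = []
    for number in range(1, 5):
        end = start + timedelta(weeks=13) - one_day
        quarters.append((f"Q{number}", start, end))
        start = end + one_day
    return quarters


# Fiscal year 2018 ending on February 2, 2019, as in the example above.
for name, start, end in retail_quarters(date(2019, 2, 2)):
    print(name, start.isoformat(), end.isoformat(), (end - start).days + 1, "days")
```

Under these assumptions every quarter is exactly 91 days long, so each one contains the same number of Mondays, Saturdays, and every other weekday.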
But it’s not just retailers - fiscal years are all over the map. Microsoft’s year ends in June, Adobe’s in November. A fair number of companies end their fiscal year on September 30, which corresponds with the United States' own fiscal calendar.
Ultimately, it doesn’t really matter all that much what the actual dates are. What really matters is that we have a contiguous collection of financial reports, and we can easily identify what we have.
What’s In A Filing
10-Qs and 10-Ks report the same type of information, just over different periods of time. Each gives a snapshot of the company’s financial health. Reports include comparable information from periods in previous years. For example, here are the first few lines of Apple’s 2019 Q2:
These reports are viewable as regular web pages at the SEC (in fact, that’s a screenshot). Each filing has a lot of information - there are statements for Cash Flows, Income, and a Balance Sheet, and a bunch of disclosures that may, for example, discuss currency risk or employee stock grants.
Until about 10 years ago, the only way you could get this data was reading the SEC's web page (or, if you were an investor in the company, you were actually mailed a physical copy of the report). Going through that data and trying to figure out if the company is doing better or worse required painstaking attention to detail while deciphering the report. Until…
XBRL
The SEC started requiring that companies augment their reports with data in a format called XBRL - Extensible Business Reporting Language, whose purpose is, according to XBRL.org, to increase the transparency and accessibility of business information by using a uniform format.
The promise of XBRL is that it gives us the ability to read the financial statements contained in filings directly into things like databases, spreadsheets, and even machine learning algorithms.
Before XBRL, this data was not really standardized. In the example above, Apple reports “Net Sales” for both products and services for several periods that are in the report. Other companies might report just “Sales” - does that mean net or gross sales? Or maybe they just use something else like “Sales Reported Period, Net.”
Reporting data in that way makes it really difficult to make comparisons. What if the wording changed between years or quarters? Even worse, it’s extremely difficult to compare across companies. For example, you might want to see if Walmart’s sales are increasing faster than Target’s. Before XBRL, you’d have to ensure that you were comparing things that are actually comparable.
Facts
The XBRL specification starts with facts. A fact is just what it sounds like: it might be a bank balance as of a certain date, or, in Apple’s case, something like “Product Net sales of $46,565 million for the three months ended March 30, 2019” (which out of sheer coincidence happens to be the first entry on the example given earlier).
Without thinking too much about XBRL in particular, consider what it would take to describe a fact. More importantly, what would it take to describe a fact unambiguously? You could have a bank balance, but that depends on an exact date. Net sales occur over a period of time - maybe a quarter or a year. And it might be limited by something - say, only product sales. So you could say a fact has to have these properties:
- A value, “$46,565 million”
- A date (or date range), “Dec 30 2018 - Mar 30 2019”
- A filter (“Products”)
- A concept (“Net Sales”)
There, in a nutshell (but with some fine tuning) is what a fact is in XBRL. Simple as pie.
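As a rough illustration, those four properties fit naturally into a small record type. This is a sketch only: the class name Fact and its field names are invented here and are not XBRL's own vocabulary. The start date assumes Apple's 13-week quarter ending March 30, 2019 began on December 30, 2018.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """One reported value plus everything needed to interpret it (illustrative)."""
    concept: str                  # what is being measured, e.g. "Net Sales"
    value: int                    # how much, in the stated unit
    unit: str                     # e.g. "USD millions"
    end: date                     # the "as of" date, or the last day of the range
    start: Optional[date] = None  # first day of the range; None for a point-in-time fact
    filter: Optional[str] = None  # optional restriction, e.g. "Products"

    @property
    def is_duration(self) -> bool:
        """True when the fact covers a date range rather than a single date."""
        return self.start is not None


product_sales = Fact(
    concept="Net Sales",
    value=46_565,
    unit="USD millions",
    start=date(2018, 12, 30),  # assumed first day of the 13-week quarter
    end=date(2019, 3, 30),
    filter="Products",
)
print(product_sales.is_duration)  # True
```

A bank balance would use the same type with only an end date, which is what separates a point-in-time fact from one that covers a period.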
Taxonomies
Or is it so simple? We still have the same problem using “Net Sales” as the concept. Who gets to decide what net sales actually means? As mentioned earlier, someone else might report “Net sales for the period” or something along those lines. Even worse, maybe someone at the company had fun with the spelling - what if it were “Net Sails”?
Humans reading through the report would get through it. We’d have a pretty good idea what the intent was, or maybe we’d be able to scan through the rest of the report and, taken in context, we’d have a pretty good idea what was meant by the term. Sure, there’s still a little bit of a gray area, but generally speaking, our knowledge, education, and experience lets us understand the meaning.
But that’s not so easy when it comes to computers, which are much less forgiving and less capable of inferring meaning without significant training. Imagine if you had to write some code that always correctly identified any form of “Net Sales,” even if it were misspelled or had extra descriptive info on it. Adding to the difficulty, when a new filing comes in, you might get a brand new way to spell it - maybe “Set Nails” - that had never been used before.
Fortunately, you don’t have to go through that. The SEC requires that all 10-Qs and 10-Ks use the US GAAP taxonomy. A taxonomy is a dictionary that defines a standardized way to "tag" concepts, and the SEC requires filers to use definitions from it. The US GAAP taxonomy is produced by the Financial Accounting Standards Board.
In the Apple example above, accountants at Apple assigned the $46,565 million value to the concept RevenueFromContractWithCustomerExcludingAssessedTax. There’s even a description of this concept:
Amount, excluding tax collected from customer, of revenue from satisfaction of performance obligation by transferring promised good or service to customer. Tax collected from customer is tax assessed by governmental authority that is both imposed on and concurrent with specific revenue-producing transaction, including, but not limited to, sales, use, value added and excise.
Well, that seems to be a lot more clear than “Net Sails.”
Taxonomies are a comprehensive, all-encompassing collection of concepts. The US GAAP taxonomy required by the SEC has more than 17,000 concepts in all, and it changes more than you might think each year. How detailed is it? Consider this concept’s description:
The hypothetical financial impact of a 20 percent adverse change of the discount rate on the fair value of transferor’s interests in transferred financial assets (including any servicing assets or servicing liabilities) as of the balance sheet date.
Yes, that’s the US GAAP concept with the tag SensitivityAnalysisOfFairValueOfInterestsContinuedToBeHeldByTransferorServicingAssetsOrLiabilitiesImpactOf20PercentAdverseChangeInDiscountRate.
Clearly there’s something for everyone.
Fundamentals
After that hopefully helpful but brief background on the SEC and XBRL, we make it to Fundamentals.
Given how detailed XBRL can be, it should be simple to just grab this data and use it, right? We should be able to say “Give me Apple’s latest Q2 net sales.” It’d be nice, but it turns out it's not really very easy to get usable data directly from XBRL filings. Believe it or not, there’s nothing in XBRL that says “this data belongs to 2019 Q1.” Values are tied to individual dates (“Cash and Equivalents,” for example) or durations (like “Net Sales”).
That’s where Intrinio comes in with Fundamentals. Think of a fundamental as a pre-filtered, sorted, and grouped bucket for related data.
Fundamentals have three useful pieces of information:
- The fiscal year
- The fiscal quarter
- The statement (balance sheet, income, or cash flow statement)
Intrinio’s API provides data in the “Big 3” statements - the Income Statement, Balance Sheet, and Cash Flow Statement. If you look at the title of the statement on a filing, you’ll see it varies. For example, the Apple income statement is called “CONDENSED CONSOLIDATED STATEMENTS OF OPERATIONS (Unaudited).” Intrinio uses machine learning to automatically identify these statements accurately, regardless of their titles (and they vary a lot from company to company).
Think of fundamentals as a “container” for facts. In the case of Apple’s 2019 Q2 filing, the following fundamentals are created by Intrinio from the SEC filing:
This table is the first hint at some of the intelligence that Intrinio adds to a filing. Let’s take a look for now at the second row - remember, Apple’s “CONDENSED CONSOLIDATED STATEMENTS OF OPERATIONS (Unaudited)” has been identified as the income statement in the second quarter filing. Let’s look again at the first few rows (and columns) of that statement:
The columns that have the header “Three Months Ended” are used to create fundamentals for statements in the filing. Looking again at the fundamentals table above for “income_statement” rows, “Three Months Ended March 30, 2019” creates a fundamental for 2019 Q2, and “Three Months Ended March 31, 2018” creates one for 2018 Q2.
So why aren’t the “Six Months Ended” columns used? Since they are Q2YTD values, they could be used as fundamentals. However, not all companies report year-to-date values on the income statement, so Intrinio calculates them directly. In addition, Intrinio calculates Q2TTM, the trailing twelve months. Since Q2YTD and Q2TTM are calculated, and not derived directly from the statement, they are created with a statement code of “calculations”.
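For items that accumulate over time, like revenue, the arithmetic behind those calculated periods is straightforward. The sketch below uses the conventional definitions (year to date is the sum of the quarters so far, trailing twelve months is the sum of the last four quarters). It is not a description of Intrinio's internal method, and the function names, key format, and numbers are all made up for illustration.

```python
def year_to_date(quarterly: dict, fiscal_year: int, quarter: int) -> int:
    """Sum the quarters of one fiscal year up to and including `quarter`."""
    return sum(quarterly[f"{fiscal_year}-Q{q}"] for q in range(1, quarter + 1))


def trailing_twelve_months(quarterly: dict, fiscal_year: int, quarter: int) -> int:
    """Sum the four most recent quarters, stepping back across the fiscal year boundary."""
    total = 0
    year, q = fiscal_year, quarter
    for _ in range(4):
        total += quarterly[f"{year}-Q{q}"]
        q -= 1
        if q == 0:  # wrap from Q1 back to Q4 of the previous fiscal year
            year, q = year - 1, 4
    return total


# Hypothetical quarterly revenue, not real company data.
revenue = {"2018-Q3": 10, "2018-Q4": 12, "2019-Q1": 15, "2019-Q2": 11}

print(year_to_date(revenue, 2019, 2))            # 26
print(trailing_twelve_months(revenue, 2019, 2))  # 48
```

Balance sheet items are point-in-time values, so this kind of summing would not apply to them.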
Financials
All of this work to create fundamentals from filings doesn’t give us a lot of meaningful information until the actual values are added in. Consider the highlighted columns above. There are six numbers highlighted that represent Apple’s Products & Services sales (plus total net sales) for 2018 and 2019 Q2.
As previously discussed, these numbers are provided in the XBRL filing as facts. They have a date (or date range) attached to them and an XBRL concept. But using XBRL directly, there’s no way to know that these values represent fiscal Q2 2018 or 2019.
Using a spreadsheet analogy, the value at the intersection of a fundamental (a column, “three months ended March 30 2019”) with a concept (row “Product Sales”) is a Financial. A financial for a particular company can be described as answering “What, When, and How Much?”
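One way to picture that intersection in code: a lookup table turns a fact's raw date range into a fiscal period, and the result is labeled with a fundamental. Everything here is hypothetical, including the FISCAL_PERIODS table and the to_financial helper. The quarter start dates are inferred from 13-week quarters rather than taken from the filing, and the identifier format simply mirrors the one used in the API URLs.

```python
from datetime import date

# Hypothetical period table for a single company: (start, end) -> (fiscal year, fiscal period).
FISCAL_PERIODS = {
    (date(2018, 12, 30), date(2019, 3, 30)): (2019, "Q2"),
    (date(2017, 12, 31), date(2018, 3, 31)): (2018, "Q2"),
}


def to_financial(ticker, statement, concept, start, end, value):
    """Attach a fiscal period to a raw fact, answering What, When, and How Much."""
    fiscal_year, fiscal_period = FISCAL_PERIODS[(start, end)]
    return {
        "when": f"{ticker}-{statement}-{fiscal_year}-{fiscal_period}",  # the fundamental (column)
        "what": concept,                                                # the concept (row)
        "how_much": value,                                              # the value in the cell
    }


financial = to_financial(
    "AAPL",
    "income_statement",
    "RevenueFromContractWithCustomerExcludingAssessedTax",
    date(2018, 12, 30),
    date(2019, 3, 30),
    46_565,  # Product net sales in USD millions, from the filing example
)
print(financial["when"])  # AAPL-income_statement-2019-Q2
```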
Reported Financials
Intrinio actually has two feeds for Financials: Reported and Standardized. As you might expect, Reported Financials provides unmodified data. Standardized Financials change the “What” in answer to the question “What, When, and How Much?” by using a uniform set of tags across all companies.
Reported Financials return the data exactly as reported by the company. Although U.S. companies are required to use the US GAAP taxonomy, each company reports its financials using whichever US GAAP concepts it chooses. Microsoft describes its revenue as simply Revenue, while Apple uses Net sales. Each company is similarly different with costs of that revenue. Underlying these descriptions, though, is the same concept.
The tag in the API response for reported financials is just the XBRL concept used for this item, as filed.
The examples below are for 2018 Q1 and, for illustration, show just the first two line items on the income statement.
Apple (AAPL): https://api-v2.intrinio.com/fundamentals/AAPL-income_statement-2018-Q1/reported_financials
Microsoft (MSFT): https://api-v2.intrinio.com/fundamentals/MSFT-income_statement-2018-Q1/reported_financials
Although this data comes unmodified from the XBRL filing, it’s worth noting how easy it is to query: Just supply the ticker, statement, and fiscal period, and you get the most recent data available.
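The URL pattern in those examples is regular enough to build programmatically. Here is a minimal sketch. The helper financials_url is hypothetical and not part of any Intrinio SDK, it only assembles the string, and it leaves out authentication, which a real request would presumably need.

```python
BASE_URL = "https://api-v2.intrinio.com"


def financials_url(ticker, statement, fiscal_year, fiscal_period, feed="reported_financials"):
    """Build a financials URL from a ticker, statement, and fiscal period."""
    # The fundamental identifier follows the pattern seen in the example URLs.
    fundamental_id = f"{ticker}-{statement}-{fiscal_year}-{fiscal_period}"
    return f"{BASE_URL}/fundamentals/{fundamental_id}/{feed}"


for ticker in ("AAPL", "MSFT"):
    print(financials_url(ticker, "income_statement", 2018, "Q1"))
```

Passing feed="standardized_financials" produces the URL for the other feed.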
A quick reminder, here’s the US GAAP definition of the revenue tags for both Microsoft and Apple, RevenueFromContractWithCustomerExcludingAssessedTax:
Amount, excluding tax collected from customer, of revenue from satisfaction of performance obligation by transferring promised good or service to customer. Tax collected from customer is tax assessed by governmental authority that is both imposed on and concurrent with specific revenue-producing transaction, including, but not limited to, sales, use, value added and excise.
Standardized Financials
The ability to query financial statements using simple fiscal periods removes one of the biggest pain points of XBRL. But it leaves something to be desired: publicly-traded companies are complex financial entities, and each has its own particularly unique way of capturing and reporting financial information.
For example, the accounting concepts used by a more traditional manufacturing company like Ford are much different than those used by Apple and Microsoft. Using the API again, here’s the top line for revenue for Ford:
https://api-v2.intrinio.com/fundamentals/F-income_statement-2018-Q1/reported_financials
The tag used by Apple and Microsoft says they excluded assessed tax in their revenue, while Ford just uses the Revenues tag. There’s also a US GAAP concept that includes assessed tax (seemingly the complement to the concept that Apple and Microsoft use). How are we supposed to know what Ford is doing with Assessed Tax? Is it something in between?
Here's the US GAAP definition of the Revenues concept:
Amount of revenue recognized from goods sold, services rendered, insurance premiums, or other activities that constitute an earning process. Includes, but is not limited to, investment and interest income before deduction of interest expense when recognized as a component of revenue, and sales and trading gain (loss).
Well, it doesn’t say anything about assessed tax, one way or another. This isn’t really a problem if you are looking at a particular filing for a particular company, because there’s probably some other data that explains it later in the filing. But what if you are trying to compare revenue between quarters or years? Has Ford always handled revenue reporting uniformly? Serious question: Has anyone? Ever?
Take it out one step further – what if you wanted to compare filings across companies, say, Apple’s revenue vs Ford’s revenue over time. As reported above, they are not entirely comparable. In the case of Ford, you’d have to try to make the Revenues concept be the functional equivalent of Apple’s RevenueFromContractWithCustomerExcludingAssessedTax (this just begs for a joke comparing Apples to Orange Mustangs).
Re-read the two definitions for the concepts each company is using to report revenue. Can you differentiate between them? If Apple used the same concept as Ford to report Revenue, would their reported total revenue value be the same?
How can we possibly compare these concepts?
Standardization of Concepts
The API documentation for Standardized Financials does a good job explaining the answer:
The primary purpose of standardized financials are to facilitate comparability across a single company’s fundamentals and across all companies fundamentals. For example, it is possible to compare total revenues between two companies as of a certain point in time, or within a single company across multiple time periods.
Let’s look at the top line revenue for Apple, Microsoft, and Ford using standardized rather than reported financials.
Apple (AAPL): https://api-v2.intrinio.com/fundamentals/AAPL-income_statement-2018-Q1/standardized_financials
Microsoft (MSFT): https://api-v2.intrinio.com/fundamentals/MSFT-income_statement-2018-Q1/standardized_financials
Ford (F): https://api-v2.intrinio.com/fundamentals/F-income_statement-2018-Q1/standardized_financials
Honestly, isn’t that a lot more pleasant? You have a consistent total revenue for all three companies, in a form that is directly comparable: the data tag operatingrevenue.
Instead of xbrl_tag, we now have data_tag. These tags are created and maintained by Intrinio and function as a simplified taxonomy that makes direct comparison possible.
Mapping between Reported and Standardized Financials
There are many concepts that actually convey the concept of revenue; the fact that Ford and Apple use two different concepts with slightly different meanings is a perfect example of this. Intrinio’s engine understands these subtleties and handles both as revenue.
That’s mapping in a nutshell: convert a company-reported concept (an XBRL concept, or xbrl_tag) into a uniform Intrinio concept (a data_tag).
It’s not quite that simple – Intrinio uses a set of sophisticated algorithms that combine machine learning with combinatorial analysis to convert from a company’s reported XBRL tag to Intrinio’s standardized data tag.
This process works on several levels, ensuring, for example, that Total Revenue properly aggregates more specific revenue by categories like Product and Service sales without double-counting.
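Setting those subtleties aside, the core idea of mapping can be pictured as a lookup table. The sketch below is deliberately naive and is not how Intrinio's engine works. The table XBRL_TO_DATA_TAG and the helper to_data_tag are invented for illustration, using only the tags mentioned in this document.

```python
from typing import Optional

# A toy mapping from company-reported XBRL concepts to a uniform data tag.
XBRL_TO_DATA_TAG = {
    "RevenueFromContractWithCustomerExcludingAssessedTax": "operatingrevenue",  # Apple, Microsoft
    "Revenues": "operatingrevenue",                                             # Ford
}


def to_data_tag(xbrl_tag: str) -> Optional[str]:
    """Return the standardized tag for a reported concept, or None if it is unmapped."""
    return XBRL_TO_DATA_TAG.get(xbrl_tag)


print(to_data_tag("Revenues"))                                             # operatingrevenue
print(to_data_tag("RevenueFromContractWithCustomerExcludingAssessedTax"))  # operatingrevenue
```

A static table like this breaks down quickly, since it cannot tell which of several revenue-like line items is the real total, and that is exactly the part that needs the more sophisticated machinery.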
Companies have also been increasingly using XBRL’s ability to dimensionalize reported data. Dimensions allow a company to disaggregate line items into whatever level of detail they choose. This is much more powerful than using concepts themselves as a differentiator, but it adds even more complexity.
For example, Apple might specifically report sales per product line, per region. The report might contain sales for “iPads in South America,” “iPhones in China,” and “Computers in Europe.” They can now report a single concept – “Total Sales” – but that concept only applies to the region and product line specified.
Apple would also have a “Total Sales” value with no dimensions – the aggregated value of each dimensionalized value.
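A short sketch shows why dimensions add complexity. The data layout and the numbers below are entirely made up. The point is only that a consumer has to separate the undimensioned total from its dimensionalized parts, or the same sales get counted twice.

```python
# Hypothetical facts that all share one concept, some qualified by dimensions.
facts = [
    {"concept": "Total Sales", "dimensions": {"product": "iPad", "region": "South America"}, "value": 10},
    {"concept": "Total Sales", "dimensions": {"product": "iPhone", "region": "China"}, "value": 60},
    {"concept": "Total Sales", "dimensions": {"product": "Computers", "region": "Europe"}, "value": 30},
    {"concept": "Total Sales", "dimensions": {}, "value": 100},  # the aggregate, with no dimensions
]

# Summing everything under the concept double-counts the breakdown.
naive_sum = sum(f["value"] for f in facts)

# Separating the aggregate from its parts gives a consistent picture.
total = next(f["value"] for f in facts if not f["dimensions"])
breakdown = sum(f["value"] for f in facts if f["dimensions"])

print(naive_sum)           # 200
print(total == breakdown)  # True
```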
This level of disaggregation, when provided, offers an extremely valuable level of insight into a company’s health, but it comes with the cost of added complexity. A major benefit of Intrinio’s standardized financials is the uniformity of the API across all companies.
Closing Thoughts
Filings at the SEC contain an incredible wealth of valuable data, especially when used to compare data over different time periods and across multiple companies and sectors.
However, that data is quite difficult to extract in a usable form, and even when extracted, not directly worthwhile for comparisons. Intrinio’s Fundamentals product solves both of these problems, creating an easy-to-use, accessible, and uniform source of data on the financial performance of publicly traded companies.
While this document explains the basics – how data gets from a filing at the SEC to a specific piece of JSON data produced by Intrinio’s API – it just scratches the surface of what's available. The standardization process differs if the company is considered a financial company (primarily banks and investment companies) or an industrial company (everything else), for example. We’ve also only briefly mentioned filings that make significant use of XBRL’s dimensions (which are available in both reported and standardized financials).
But think for a moment about what the API represents: with three API calls, I can tell you that the Operating Revenue for Apple, Microsoft, and Ford in 2018 Q1 was $88.293 billion, $24.538 billion, and $41.959 billion, respectively. Try doing that with raw XBRL.