Open Finance Society

Financial analysis using SQL

I guess everyone knows/uses Excel and financial analysis is traditionally done in Excel. SQL-in-browser is a core analytical engine here. After talking to some early users, I think it can be helpful to showcase how to use SQL to do simple math based on the fundamental data here.

Excel vs SQL

There are obviously various difference. From analytical standpoint, 1 obvious difference is that Excel is row-and-column-agnostic, and you are free to drag cells horizontally and/or vertically to do SUM(...).

SQL as column-major calc engine

However, SQL is column-major, so when you do:

    SELECT a, b, (a+b) as sum FROM tbl WHERE a>10

SQL assumes a and b are columns and automatically iterate overall all rows as long as the WHERE condition is met. No need to drag anything here.

Now in the SQL Studio, you can paste the code and see quarterly Iphone sales revenue time series:

    SELECT cik_int, segment, item, fyrqtr, "value" as rev
    FROM 'funda_q_v1.parquet'
    WHERE cik_int  = 320193 AND segment = 'aapl:IPhoneMember'
    AND item = 'us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax'
    ORDER BY fyrqtr

Removing filtering conditions will get you more data.

Pivoting table in SQL

As you can see from the data above, the data on this website has "long panel" form (each row represents a company/segment/item/quarter) and there is just 1 "value" column for all accounting itmes.

Just like Excel, we can do long-to-wide pivoting: Get segment revenue over time for Apple as wide panel:

    WITH cte_pins AS (
        SELECT segment, fyrqtr, value
        FROM 'funda_q_v1.parquet'
        WHERE cik_int = 320193
        AND item = 'us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax'
        AND segment!=''
    )
    SELECT *
    FROM (PIVOT cte_pins ON segment USING ANY_VALUE(value) GROUP BY fyrqtr)

The PIVOT syntax is overall self-explanatory and you can ask your favorite LLM to write it up for you (I do it myself). The WITH-AS defines a Common Table Expression (CTE), which works like a temporary table in the middle of an Excel workflow.

Trailing 12mo calc in SQL

Analysts commonly look at trailing 12mo sum and it is also possible here. Example for T12M of iPhone revenue:

    SELECT cik_int, segment, item, fyrqtr, "value"  as rev,
    SUM("value") OVER (
        PARTITION BY cik_int, segment 
        ORDER BY fyrqtr 
        ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
    ) AS rev_t12m  
    FROM 'funda_q_v1.parquet'  
    WHERE cik_int = 320193 AND segment = 'aapl:IPhoneMember'  
    AND item = 'us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax'  
    ORDER BY fyrqtr

The PARTITION-ORDER syntax is called Window Function. It may look like a mounthful because it is future-proof and allows you to easily change your aggregation method without the need to drag or re-drag any cell or formula.

Conclusion

This blog shows that SQL is very expressive in table manipulation and you can perform various calc as your did in Excel. As you get more familar with the syntax, you can mix-and-match these components to create a more sophicliated workflow for your own use case.

Moreover, SQL is very scalable. The underlying data has million-level rows and you can run a query in a matter of seconds.