CRSP US Stock Databases

About the database

CRSP is renowned for its expertise in building and maintaining historical, academic, research-quality stock market databases. The CRSP US Stock Databases provide a unique research source characterized by unmatched breadth and depth. They include CRSP’s permanent identifiers allowing for clean and accurate backtesting, research utilizing time-series and corporate event data, performance measurement, benchmarking, and securities analysis.

The CRSP US Stock Databases contain daily and monthly market and corporate action data for over 32,000 active and inactive securities with primary listings on the NYSE, NYSE American, NASDAQ, NYSE Arca and Bats exchanges and include CRSP broad market indexes. CRSP databases are characterized by their comprehensive corporate action information and highly accurate total return calculations.

New Flat File Format 2.0 (CIZ) is Here!


CRSP has introduced a new Flat File Format 2.0 (CIZ) for the CRSP US 1925 and 1962 Stock and Stock with Index Databases. All Stock and Index subscribers – monthly, quarterly, and annual – received this premiere data cut. As was true for the Flat File Format 1.0 (SIZ), the files are available in SAS, ASCII and R formats.

Click on the Documentation Tab below for new resources. 

Please email support@crsp.org and indicate your preferred file format. If we do not receive your preferred file format, your organization will not have access to the new Flat File Format 2.0 (CIZ).


Knowledge Base

Citigroup volume has missing values for four dates

There are four dates where the daily trading volume for Citigroup, PERMNO 70519, Ticker C, exceeds our database’s maximum value (2147483648). Instead of inserting a false value into the database, we have listed the volumes for these dates as -99 (missing). The true trading volume values for those dates:

Date      Volume
20090805  2674463281
20091217  3772638437
20091218  2813697156
20101207  3267829406
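
For analyses that need the actual volumes, the four missing observations can be patched from the table above. The following is a minimal Python sketch, assuming daily records are held as (permno, date, volume) tuples. The record layout and the function name patch_citigroup_volume are illustrative assumptions and are not part of any CRSP tool.

```python
# True Citigroup (PERMNO 70519) volumes for the four dates stored as -99.
# Values are copied from the table above.
CITI_PERMNO = 70519
MISSING_VOLUME = -99
TRUE_VOLUMES = {
    20090805: 2674463281,
    20091217: 3772638437,
    20091218: 2813697156,
    20101207: 3267829406,
}


def patch_citigroup_volume(records):
    """Return a copy of (permno, date, volume) records with the four
    missing Citigroup volumes filled in; all other rows are unchanged."""
    patched = []
    for permno, date, volume in records:
        if (permno == CITI_PERMNO and volume == MISSING_VOLUME
                and date in TRUE_VOLUMES):
            volume = TRUE_VOLUMES[date]
        patched.append((permno, date, volume))
    return patched


# Hypothetical sample rows (the 20090804 volume is invented for illustration).
sample = [(70519, 20090804, 1000000), (70519, 20090805, -99)]
print(patch_citigroup_volume(sample))
```

Python integers are not limited to 32 bits, so the patched values are held without overflow.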

How can a security have a negative dividend amount?

There are four PERMNOs in the CRSP security universe with negative dividend amounts. In the 1920s, some companies required the payment of assessment fees from their shareholders. These assessment fees are recorded as negative dividend amounts and represent an obligation of the shareholders to make payments. Two securities, Chicago Milwaukee St. Paul & Pacific (PERMNO 11228) and Wallace Murray Corp (PERMNO 15229), fall into that category. Two other securities with negative dividend amounts are more recent: British Airways PLC (PERMNO 71589) and BT Group PLC (PERMNO 66835). These amounts have been verified as installment payments.

CHICAGO MILWAUKEE ST PAUL & PAC
SIC: 400 (Railroads)

PERMNO DISTCD DIVAMT EXDT
11228 2844 -32.0600014 3/2/1928

WALLACE MURRAY CORP
SIC: 3431 (Heating Equipment, Except Electric And Warm Air)

PERMNO DISTCD DIVAMT EXDT
15229 3260 -10.0000000 1/27/1927

 

BRITISH AIRWAYS PLC
Share code: 36
SIC: 4512 (Air Transportation Scheduled)

PERMNO DISTCD DIVAMT EXDT
71589 1282 -9.9200001 8/12/1987

B T GROUP PLC
SIC: 4813 (Communications Services, Not Elsewhere Classified)

PERMNO DISTCD DIVAMT EXDT
66835 1280 -3.1080000  6/3/1985
66835 1280 -2.7387900  4/2/1986

 

How are Standard Industrial Classification (SIC) codes collected?

CRSP does not actively assign SIC codes. We obtain NASDAQ SIC codes directly from the NASDAQ exchange and obtain NYSE, NYSE American, and ARCA SIC codes from Interactive Data (formerly FT Interactive Data). All data providers have told CRSP that they reference SEC documents as SIC code sources.

SIC codes can be useful for rough groupings of industries. Beyond that, they should be used with caution: they are not assigned or reviewed under a strict procedure by any government agency, most large companies belong in multiple SIC codes, and the codes change over time. After the initial SIC code assignment when a company goes public, no government agency looks at that code or the company again, and quite often a company reports its initial SIC code forever. In some cases, companies still reported obsolete SIC codes from the 1972 coding scheme in their SEC filings in the early 1990s.

The U.S. Department of Labor provides information on the 1987 SIC coding scheme at http://www.osha.gov/oshstats/sicser.html.

Why does the history for PowerShares QQQ Trust only extend to April 12, 2007?

PowerShares QQQ Trust is an exchange traded fund; there are some special considerations regarding its association with another ETF security, NASDAQ 100 Trust Series I.

  • PowerShares QQQ Trust begins its history on April 12, 2007, and the NASDAQ 100 Trust Series I ends its history the previous day, April 11, 2007.
  • The final distribution recorded for the NASDAQ 100 Trust Series I lists PowerShares QQQ as its acquirer.
  • By CRSP's methodology, the security-level identifier, PERMNO, cannot be associated with multiple company-level identifiers, PERMCO. Thus, PowerShares QQQ cannot keep the same security-level identifier as the NASDAQ 100 Trust Series I, because the two securities have different PERMCO identifiers.
  • To merge the return histories of both securities, replace the initial null value for PowerShares QQQ Trust with the delisting return of the NASDAQ 100 Trust Series I.
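
The last step can be sketched in Python as follows. This is an illustration under stated assumptions: each return history is a list of (date, return) pairs with None marking a missing return, the function name splice_returns is hypothetical, and every return value shown (including the delisting return of 0.001) is invented and is not actual CRSP data.

```python
def splice_returns(old_returns, old_delisting_return, new_returns):
    """Join two return histories by replacing the successor's initial
    missing return with the predecessor's delisting return."""
    if not new_returns or new_returns[0][1] is not None:
        raise ValueError(
            "expected the successor history to start with a missing return"
        )
    first_date = new_returns[0][0]
    return (
        list(old_returns)
        + [(first_date, old_delisting_return)]
        + list(new_returns[1:])
    )


# Hypothetical inputs: the dates follow the text, the return values are invented.
nasdaq_100_trust = [(20070410, 0.002), (20070411, -0.001)]
powershares_qqq = [(20070412, None), (20070413, 0.004)]

merged = splice_returns(nasdaq_100_trust, 0.001, powershares_qqq)
print(merged)
```

The function refuses to splice when the successor history does not begin with a missing value, which guards against overwriting a real return.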


Why do more NYSE-listed securities have missing closing bid & ask data during the period from 20001002 to 20010518?

In researching the bid and ask quotes, CRSP found a number of questionable data points. Registered market makers, usually off the primary listed exchange, periodically posted noncompetitive quotes that were captured as the closing bid and ask. These quotes had wide spreads that were not typical of the day's trading activity. In many cases, spreads were intentionally set as wide as possible, with bids close to a penny and asks close to double the market price. Alternate quotes that more closely represented the trading activity were not immediately available, so a missing value is reported pending further review by CRSP. Roughly 5% of all bid and ask data points are missing. Over one-third of these occur between October 2000 and May 2001.

What is the meaning behind an exchange code of 0?

If an issue leaves an exchange that is covered by CRSP (NYSE, NYSE American, or NASDAQ) and later returns, the gap is marked in the Name History Array with an Exchange Code of 0. During this time, event data is not tracked and time series data is filled in with missing values.

CRSP will resume coverage of the security if it has a primary listing on NYSE, NYSE American, or NASDAQ and otherwise fulfills our universe requirements for the stock database. A new PERMNO or PERMCO may be assigned, depending on events that took place during the off-CRSP period.

EXAMPLE:

PERMNO 83404

Namedt    Enddt     Cusip     Ticker  Company Name      CLS  SH  Ex  SIC
19960430  19990616  71892810  PHYX    PHYSIOMETRIX INC       11   3  3840
19990617  20000611  71892810          PHYSIOMETRIX INC       11   0  3840
20000612  20010823  71892810  PHYX    PHYSIOMETRIX INC       11   3  3840
20010824  20050729  71892810  PHYX    PHYSIOMETRIX INC       11   3  3840

This security ceased trading on NASDAQ on June 16, 1999.  It resumed trading on NASDAQ on June 12, 2000.  The interval between these dates is represented by a name line with Exchange Code set to zero, and the Ticker field blank.
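
To locate such gaps programmatically, one can scan a security's name history for lines with Exchange Code 0. The sketch below assumes name-history rows are available as Python dictionaries. The key names namedt, enddt, and exchcd and the function name off_exchange_gaps are illustrative assumptions; the rows are taken from the PERMNO 83404 example above.

```python
from datetime import date


def off_exchange_gaps(name_history):
    """Return (start, end, calendar_days) for each name line whose
    Exchange Code is 0, i.e. each off-CRSP interval."""
    gaps = []
    for row in name_history:
        if row["exchcd"] == 0:
            # Count both the first and the last day of the interval.
            days = (row["enddt"] - row["namedt"]).days + 1
            gaps.append((row["namedt"], row["enddt"], days))
    return gaps


permno_83404 = [
    {"namedt": date(1996, 4, 30), "enddt": date(1999, 6, 16), "exchcd": 3},
    {"namedt": date(1999, 6, 17), "enddt": date(2000, 6, 11), "exchcd": 0},
    {"namedt": date(2000, 6, 12), "enddt": date(2001, 8, 23), "exchcd": 3},
    {"namedt": date(2001, 8, 24), "enddt": date(2005, 7, 29), "exchcd": 3},
]

gaps = off_exchange_gaps(permno_83404)
print(gaps)
```

Time series observations that fall inside a reported interval can then be treated as missing by construction rather than as data errors.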

When a security is delisted and then begins trading again at a later date, does CRSP pick it up again or is it lost forever?

If an issue leaves an exchange that is covered by CRSP (NYSE, NYSE American, or NASDAQ) and later returns, the gap is marked in the Name History Array with an Exchange Code of 0. During this time, event data is not tracked and time series data is filled in with missing values.

CRSP will resume coverage of the security if it has a primary listing on NYSE, NYSE American, or NASDAQ and otherwise fulfills our universe requirements for the stock database. A new PERMNO or PERMCO may be assigned, depending on events that took place during the off-CRSP period.

For an example, see the name history for PERMNO 83404 under "What is the meaning behind an exchange code of 0?" above.

What is when-issued trading? How are issues trading when-issued handled in the CRSP stock database?

Securities trade on a when-issued basis when officially they have yet to be issued, but they trade as if they have been. These transactions are formally settled after the securities have been issued.

INCLUDED IN THE SUBSCRIBER VERSION OF THE STOCK DATABASE

"Reorganization" - All of the shares of an established security are trading when-issued due to some type of restructuring

EXCLUDED FROM THE SUBSCRIBER VERSION OF THE STOCK DATABASE

"Leading prices" - All of the shares of a security are trading when-issued at the start of its trading history

"Additional" - A security issues more shares that trade when-issued, during the same time that the established shares of the security continue to trade regular-way

"Ex-distribution" - A portion of a security's shares trade when-issued and without entitlement to a particular distribution, during the same time that the rest of its shares trade regular way, with the entitlement to the distribution

What do the characters in the CUSIP number represent? Why do some values in the CUSIP field begin with a letter? Can type of security or share class be determined using only CUSIP number?

The CUSIP number consists of nine characters. The first six characters uniquely identify the issuer and have been assigned to issuers in approximate alphabetical sequence. The seventh and eighth characters identify the issue. The ninth character is used as a check digit and is not stored in the CRSP US Stock Databases. The CRSP/Compustat Merged Database (CCM) displays all nine characters of CUSIP number for covered securities.

CINS (CUSIP International Numbering System) identifies securities issued outside of the US or Canada that trade internationally. ADRs are issued in the US, so they are not in this category. Values for CINS begin with a letter indicating country or region.

While the seventh and eighth characters of the CUSIP number are used to distinguish different issues of the same issuer, there is no absolute scheme for determining type of security or share class solely on their values.
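
Because the CRSP US Stock Databases store only the first eight characters, the ninth can be recomputed when a nine-character CUSIP is needed for matching against other sources. The sketch below implements the standard CUSIP check-digit rule (a modulus-10 "double-add-double" scheme). The function name is an illustrative choice, and 03783310 and 59491810 are the widely published eight-character CUSIPs for Apple and Microsoft common stock, used here only as test values.

```python
SPECIAL_VALUES = {"*": 36, "@": 37, "#": 38}


def cusip_check_digit(cusip8):
    """Compute the ninth (check) character for an eight-character CUSIP."""
    if len(cusip8) != 8:
        raise ValueError("expected exactly eight characters")
    total = 0
    for position, char in enumerate(cusip8.upper()):
        if char in "0123456789":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10  # A=10, B=11, ..., Z=35
        elif char in SPECIAL_VALUES:
            value = SPECIAL_VALUES[char]
        else:
            raise ValueError(f"invalid CUSIP character: {char!r}")
        if position % 2 == 1:
            value *= 2  # double every second character
        total += value // 10 + value % 10  # add the digits of the value
    return str((10 - total % 10) % 10)


for cusip8 in ("03783310", "59491810"):
    print(cusip8 + cusip_check_digit(cusip8))
```

Appending the result to an eight-character CUSIP yields the nine-character form used by sources such as CCM.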

Are there advantages to using CRSP's PERMNO instead of CUSIP number?

There are three situations that can cause missed links or mismatches when using CUSIPs from different sources:

1. CUSIPs can change over time for a security due to name changes or capital changes. If a database only contains the most recent CUSIP, or reassigns CUSIPs after trading stops, a backtesting universe identified by CUSIP will continue to drop links to the security data over time.

2. CUSIP allows a mechanism for third parties to assign an unofficial CUSIP to a security that otherwise has none. These CUSIPs contain a 9 in the 4th and 5th and/or 7th digit. If different third parties select different dummy CUSIPs, the link between them can be missed or wrong. This is only an issue before 1968, when CUSIP did not yet exist, or in a few cases where foreign issues on US exchanges were assigned an ISIN but never a CUSIP.

3. CUSIP provides for the possibility of reusing a CUSIP, or equivalently, it may continue a CUSIP after a corporate event that could be considered significant enough to produce a new company or issue. There is only one known case when this occurred: PepsiAmericas in 2000 and 2001.

CRSP's name history for a security tracks changes in its CUSIP number, accessible by the unique identifier, PERMNO. On CRSP, a PERMNO can be associated with more than one eight-digit CUSIP number over time, but an eight-digit CUSIP number can only be associated with one PERMNO. CRSP never changes or drops CUSIPs that were ever active, so backtesting universes identified by CUSIP always link to the correct data. In the third situation mentioned above, CRSP will assign a dummy CUSIP to the older security to preserve the uniqueness of CUSIP to PERMNO.
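
Because each eight-digit CUSIP maps to a single PERMNO, historical CUSIPs can be resolved with a simple lookup. The following is a minimal Python sketch, assuming name-history rows are available as (PERMNO, CUSIP) pairs. The function name build_cusip_index is hypothetical, and the rows shown are taken from examples elsewhere on this page.

```python
def build_cusip_index(name_history_rows):
    """Map every CUSIP that ever appeared in a name history to its PERMNO.

    Raises ValueError if a CUSIP is linked to two different PERMNOs, which
    should not happen under the uniqueness rule described above.
    """
    index = {}
    for permno, cusip in name_history_rows:
        existing = index.setdefault(cusip, permno)
        if existing != permno:
            raise ValueError(
                f"CUSIP {cusip} maps to PERMNOs {existing} and {permno}"
            )
    return index


rows = [
    (83404, "71892810"),
    (89757, "81235010"),
    (13656, "81236210"),
]
cusip_index = build_cusip_index(rows)
print(cusip_index["81236210"])
```

The explicit uniqueness check is a design choice: it surfaces a bad link immediately instead of silently keeping whichever PERMNO was seen last.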

How are Shares Outstanding data collected and maintained?

CRSP primarily relies on vendors for shares data. Shares Outstanding values are reviewed by CRSP researchers when a security is added to our database. Subsequent shares are run through an extensive set of CRSP filters which look primarily for magnitude jumps and drops in share values and reversals of values. Depending on the source, a lag of one or two months may be seen between when changes in Shares Outstanding values occur and when they are reflected in our sources.

What are ADRs and how are the shares outstanding computed?

Companies incorporated in foreign countries and trading on foreign exchanges can be traded in the U.S. stock market as American Depositary Receipts (ADRs).

An investment bank will buy shares of a non-US-traded stock in a foreign market and issue ADR shares on the US market. A depositary bank handles the issuance and cancellation of ADR certificates.

The depositary bank sets the ratio of US ADRs per home country share. This ratio can be less than or greater than 1 in order to get the ADR in a comparable trading range with US equities. CRSP does not have a variable for the ratio of underlying ordinary shares to ADR shares. The shares outstanding values for ADRs listed in the CRSP stock database are ADR shares traded in the U.S. and not the underlying common shares.

What are equity securities and are they all included in the CRSP database?

In general, any security that represents ownership interest in an entity is equity. All the securities listed on CRSP are equity securities, but not all equity securities are listed on CRSP.

The following sharecodes are included on the subscriber version of the CRSP database:

10s - Ordinary Common Shares (domestic), Capital Stock, Global Registered Shares
20s - Certificates, Americus Trusts (exclusively sharecode 23)
30s - ADRs, ADSs (American Depositary Receipts, American Depositary Shares), New York Registry Shares
40s - SBIs (Shares of Beneficial Interest)
70s - Units (Units of Beneficial Interest, Units of Limited Partnership Interest, Royalty Trusts, Trust Units, Depository Units), ETFs (exclusively sharecode 73)

The following sharecodes are maintained internally by CRSP, but are excluded from the subscriber version of the CRSP database:

50s - Warrants
60s - Rights
80s - Preferred Shares, Capital Income Securities (with interest rate)
90s - Bundled units, such as a common packaged with a warrant or right
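
One way to apply these groupings is to look at the first digit of the two-digit share code. The Python sketch below is illustrative only: the function name classify_share_code and the dictionary labels are assumptions that paraphrase the two lists above.

```python
# First digit of the two-digit share code -> description, per the lists above.
INCLUDED_GROUPS = {
    1: "Ordinary Common Shares, Capital Stock, Global Registered Shares",
    2: "Certificates, Americus Trusts",
    3: "ADRs, ADSs, New York Registry Shares",
    4: "SBIs",
    7: "Units, ETFs",
}
EXCLUDED_GROUPS = {
    5: "Warrants",
    6: "Rights",
    8: "Preferred Shares, Capital Income Securities",
    9: "Bundled units",
}


def classify_share_code(share_code):
    """Return (in_subscriber_version, description) for a two-digit share code."""
    if not 10 <= share_code <= 99:
        raise ValueError("share code must have two digits")
    first_digit = share_code // 10
    if first_digit in INCLUDED_GROUPS:
        return True, INCLUDED_GROUPS[first_digit]
    return False, EXCLUDED_GROUPS[first_digit]


# Share code 36 is the value shown for British Airways PLC on this page.
print(classify_share_code(36))
```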

Are IPO data available?

Daily open prices are available for securities traded on the NYSE, NYSE American, and NASDAQ exchanges beginning June 15, 1992. They represent the first trade after the market opens. For NYSE, additional daily open prices are available between December 1925 and June 1962.

If a security went public on a CRSP-covered exchange under the circumstances described above, its IPO data should be available.

Why isn't AX (Archipelago Holdings Inc) covered in the CRSP Stock Database?

Archipelago Holdings is an unusual case. It has unlisted trading privileges on NYSE American, but its primary listing is on the Pacific Stock Exchange. Per CRSP convention, a security needs to have a primary listing on NYSE, NYSE American, or NASDAQ for coverage in the stock database.

Why are there Closing Bids that are greater than or equal to Closing Asks in the CRSP Stock Database?

This situation is known as crossed quotations. Crossed quotations happen when the Ask price of one Market Maker is the same as or lower than the Bid price of another Market Maker. Though it is an unusual condition, it does occur in fast-moving markets. Exchanges have end-of-day cutoff filters in place for certain situations, but crossed quotations are not filtered, as they are deemed acceptable by FINRA (Financial Industry Regulatory Authority). Because these cases of conflicting bid and ask values legitimately exist, CRSP also retains these quotes and makes them available in our database.
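
Researchers who want to review these observations rather than drop them can flag them with a simple comparison. The following is a minimal Python sketch with invented quotes; the (date, bid, ask) tuple layout and the function name crossed_quotes are assumptions, and None stands in for a missing quote.

```python
def crossed_quotes(quotes):
    """Return the (date, bid, ask) rows whose closing bid is greater than
    or equal to the closing ask. Rows with a missing side are skipped."""
    return [
        (day, bid, ask)
        for day, bid, ask in quotes
        if bid is not None and ask is not None and bid >= ask
    ]


# Invented sample data for illustration only.
sample_quotes = [
    (20240102, 10.00, 10.02),  # normal: bid below ask
    (20240103, 10.05, 10.05),  # bid equals ask
    (20240104, 10.10, 10.08),  # bid above ask
    (20240105, None, 10.12),   # missing bid
]
flagged = crossed_quotes(sample_quotes)
print(flagged)
```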

For any given point in time, why is the S&P 500 membership not equal to 500 in the CRSP Stock and Index databases?

CRSP receives S&P 500 security addition and deletion information directly from Standard and Poor’s. Sometimes a security will delist from the CRSP Stock database earlier than Standard and Poor’s deletion date.  In other cases, Standard and Poor’s addition date will include “when-issued” security data which CRSP does not include in the Stock database. Both of these cases will decrease the S&P 500 membership count in the CRSP database.

Sears Holdings Corp and Sears Hometown & Outlet Stores PERMCO Revision and Clarification

In 201209, Sears Holdings Corp had a rights distribution for Sears Hometown & Outlet Stores, Inc for all of its common shareholders. The rights offering was assigned the CRSP Security Identifier, PERMNO, 95601. NASDAQ assigned the issue identifier, Issuno, and org_id, Compno, 70617 and 60072337, respectively. The org_id assignment is somewhat inconsistent, because the security had the same CUSIP Issuer Code as Sears Holdings Corp but was given a different org_id (60072337 vs 60019149) and a different ticker. As a result, we assigned the right the same CRSP Company Identifier, PERMCO, as the parent company.

In 201210, the Sears Hometown & Outlet Stores, Inc common shares were issued to cover the rights that had been exercised. NASDAQ gave these shares an issue identifier of 71332 and org_id of 60072337. This is the same org_id it gave the right in the previous month even though they have different CUSIP Issuer Codes, but the same tickers. Because the new security had the same org_id as PERMNO 95601, it was given the same PERMCO.

Seen all together, it is clear that PERMNO 95601 should not have been given org_id 60072337 and PERMNO 13656 should not have the same PERMCO as PERMNO 89757. To fix the identifiers, we changed org_id for PERMNO 95601 to 60019149 from 60072337 (this is a non-subscriber permno) and changed the PERMCO for PERMNO 13656 to 54517 from 44118.

Before:

PERMNO CUSIP   PERMCO   Compno   Issuno EX  SIC Name  Dist Share Dlst Nasd

89757 81235010  44118 60019149    40191  3 5331    4     4    93    1    7

  Begdt - Enddt   HTick  DEL Latest Company Name

20030610-20130830 SHLD   100 SEARS HOLDINGS CORP

PERMNO CUSIP   PERMCO   Compno   Issuno EX  SIC Name  Dist Share Dlst Nasd

95601 81235011  44118 60072337    70617  3    0    1     0     0    0   10

  Begdt - Enddt   HTick  DEL Latest Company Name

20120912-20121002 SHOSR    0 SEARS HOLDINGS CORP

PERMNO CUSIP   PERMCO   Compno   Issuno EX  SIC Name  Dist Share Dlst Nasd

13656 81236210  44118 60072337    71332  3 9999    1     0    10    1    3

  Begdt - Enddt   HTick  DEL Latest Company Name

20121012-20130830 SHOS   100 SEARS HOMETOWN & OUTLET STRS INC

After:

PERMNO CUSIP   PERMCO   Compno   Issuno EX  SIC Name  Dist Share Dlst Nasd

89757 81235010  44118 60019149    40191  3 5331    4     4    93    1    7

  Begdt - Enddt   HTick  DEL Latest Company Name

20030610-20130830 SHLD   100 SEARS HOLDINGS CORP

PERMNO CUSIP   PERMCO   Compno   Issuno EX  SIC Name  Dist Share Dlst Nasd

95601 81235011  44118 60019149    70617  3    0    1     0     0    0   10

  Begdt - Enddt   HTick  DEL Latest Company Name

20120912-20121002 SHOSR    0 SEARS HOLDINGS CORP

PERMNO CUSIP   PERMCO   Compno   Issuno EX  SIC Name  Dist Share Dlst Nasd

13656 81236210  54517 60072337    71332  3 9999    1     0    10    1    3

  Begdt - Enddt   HTick  DEL Latest Company Name
20121012-20130830 SHOS   100 SEARS HOMETOWN & OUTLET STRS INC