WHARTON RESEARCH DATA SERVICES WRDS Linking Procedures Xuan Liu

22 Slides224.41 KB

WHARTON RESEARCH DATA SERVICES WRDS Linking Procedures Xuan Liu May, 2020

Topics Outline 1 Common Identifiers Used for Linking 2 WRDS Linking Procedures 3 Linking Resources on WRDS 2 Wharton Research Data Services

Typical Data Structure in Datasets on WRDS The identifier table: Permanent/ Primary identifier Company name Start date End date Example of permanent identifiers: CRSP – PERMCO PERMNO Compustat – GVKEY IBES – IBES ticker The data table: Permanent/ Primary identifier Date Item 1 Item 2 3 Wharton Research Data Services

Common Identifier Used for Linking - CUSIP CUSIP identifiers cover a wide range of global financial instruments, including extensive equity and debt issues, derivatives, syndicated loans, hedge funds, and listed options for the US and Canada. Definition Assigned by CUSIP Global Services (CGS) Format 9 characters(CUSIP-9) Recycled? No The structure of a CUSIP New offering Corporate actions (name change, merger, reorganization, chapter 11 or reverse stock split) Issuer Issue The check digit 4 Wharton Research Data Services

Common Identifier Used for Linking - SEDOL The stock exchange daily official list (SEDOL) is a seven-character unique, market-level, global security identification code with the coverage of all asset classes. Definition Assigned by The London Stock Exchange Group 7-character (a six-alphanumeric code a trailing check digit) Format Recycled? No Note : SEDOL codes are allocated sequentially and with no inherent meaning for each market an instrument is traded on. SEDOL Allocation Guidelines ( https://www.londonstockexchange.com/products-and-services/reference-data/sedol-master-file/sedol-allocatio n-guidelines.pdf ) An example: HSBC Holdings Plc (ISIN: GB0005405286) SEDOL London HK Bermuda Euronext Paris Frankfurt OTC 0540528 6158163 2367543 4097279 5722592 B00JZT0 5 Wharton Research Data Services

Common Identifier Used for Linking - ISIN Definition An International Securities Identification Number uniquely identifies a security. Assigned by The middle nine characters of the ISIN is administered by the relevant National Numbering Agency (NNA) in each country. 12-character alphanumeric code, first 2 characters for the issuing country 9 alphanumeric characters identifies the security 1 1 numerical check digit Format Recycled? No The structure of a ISIN Country code2 National Securities ID The check digit 1. CUSIP is used as the national securities ID for US and Canadian ISINs. For eg., The CUSIP for IBM is 459200101, its ISIN is US4592001014; For UK, SEDOL is part of the national securities ID. For eg., the SEDOL for BP PLC is 0798059, its ISIN is GB0007980591. 2. ISO 3166-1 alpha-2 code for the issuing country. 6 Wharton Research Data Services

Common Identifier Used for Linking - Ticker An abbreviation used to uniquely identify publicly traded shares of a particular stock on a particular stock market1 Definition Assigned by The stock exchange a security is traded on Format Alphanumeric code of various lengths Recycled? Yes Note: Tickers are recycled Ticker Ticker: V Start date End date Exchange Company name 2000/09/12 2006/08/03 NYSE VIVENDI 2008/03/19 2019/12/31 NYSE VISA INC CUSIP Exchange Ticker One company has two issues: Company name Date Alphabet Inc 20191231 02079K10 02079K30 NASDAQ GOOG GOOGL 1. If used for international data, need to check exchange information besides date. 7 Wharton Research Data Services

Common Identifier Used for Linking - CIK Definition Assigned by Format Recycled? The Central Index Key (CIK) is used on the SEC's computer systems to identify corporations and individual people who have filed disclosure with the SEC. The United States Securities and Exchange Commission (SEC) 10 digit number No 8 Wharton Research Data Services

Common Identifier Used for Linking – Company Names The challenge: Cname1 BENJAMIN FRANKLIN F S L A OR HOME & CITY SVGS BK ALBANY NY BLACKROCK BUILD AMERICA BOND TR BLACKROCK TAXABLE MUNI BOND TR GUGGENHEIM BLD AM BDS MGD DUR TR GUGGENHEIM TAXABLE MUN MGD DUR T GLOBAL EDUCATION & TECH GRP LTD T A L EDUCATION GROUP HOWARD HUGHES CORP BRIGGS MFG CO Cname2 BENJAMIN FRANKLIN FEDERAL SAVINGS & LOAN ASSN/OR HOME & CITY SAVINGS BANK/NY BLACKROCK TAXABLE MUNICIPAL BOND TRUST BLACKROCK TAXABLE MUNICIPAL BOND TRUST GUGGENHEIM TAXABLE MUNICIPAL MANAGED DURATION TRUST GUGGENHEIM TAXABLE MUNICIPAL MANAGED DURATION TRUST GLOBAL EDUCATION & TECHNOLOGY GROUP LTD TAL EDUCATION GROUP HOWARD HUGHES CORP (THE) BRIGGS MANUFACTURING CO SAS SPEDIS(Cname1, Cname2) 37 28 35 13 50 29 12 9 16 38 9 Wharton Research Data Services

Unconventional Linking Variables Company address Total asset Total sales Stock price Data Set 1 Data Set 2 10 Wharton Research Data Services

Permanent Identifiers , Company vs. Security Level, Header vs. Historical New shares issued Company name change IPO 2014040 3 Google Inc. Trading symbol: GOOGL GOOG Alphabet Inc. Trading symbol: GOOGL GOOG Google Inc. CUSIP: 38259P50 Google Inc. CUSIP: 38259P50 38259P70 Alphabet Inc. CUSIP: 02079K30 02079K10 Google Inc. CIK: 0001288776 Google Inc. CIK: 0001288776 Alphabet Inc. CIK: 0001652044 20040819 NASDAQ CUSIP Global Services (CGS) SEC CRSP Compustat As of 5/7/2020 Google Inc. Trading symbol: GOOG 20151005 Google Inc. PERMCO 45483 PERMNO 90319 NCUSIP 38259P50 TICKER GOOG Google Inc. PERMCO 45483 PERMNO 90319 NCUSIP 38259P50 TICKER GOOGL PERMNO 14542 TICKER GOOG NCUSIP 38259P70 Alphabet Inc. PERMCO 45483 PERMNO 90319 NCUSIP 02079K30 TICKER GOOGL PERMNO 14542 TICKER GOOG NCUSIP 02079K10 Alphabet Inc. GVKEY 160329 IID 01 TICKER GOOGL CUSIP 02079K305 IID 03 TICKER GOOG CUSIP 02079K107 Alphabet Inc. GVKEY 160329 IID 01 TICKER GOOGL CUSIP 02079K305 IID 03 TICKER GOOG CUSIP 02079K107 Alphabet Inc. GVKEY 160329 IID 01 TICKER GOOGL CUSIP 02079K305 IID 03 TICKER GOOG CUSIP 02079K107 11 Wharton Research Data Services

Topics Outline 1 Common Identifiers Used for Linking 2 WRDS Linking Procedures 3 Linking Resources on WRDS 12 Wharton Research Data Services

WRDS Linking Procedures Find/create the identifier (ID) table Examine the ID table Permanent identifiers? Common identifiers? Company or security level? Header or historical? Valid date range etc. Generate the linking table Use the linking table to merge data table and check the results 13 Wharton Research Data Services

Data Inconsistency 1. Information capture Effective date: Oct 5th, 2015 Google Inc. 38259P50 38259P70 Alphabet Inc. 02079K30 02079K10 2. Data item definition Eg. IBES CRSP Compustat (Snapshot) IBES TAQ 20151005 20151005 20151015 20151002 Company Name CUSIP BP PLC IBM EX079805 45920010 Definition: SEDOLs are used for non-US companies. 2-character country code first six digits of the official SEDOL 3. Data format Eg. BoardEx 4. Company name variations Eg. IBM Company name 1-800FLOWERS.COM INC First American Scientific Corp CIKCode 0001084869 1002822 Database Company name IBES INTL BUS MACH INTERNATION AL BUSINESS MACHINE CRSP INTERNATION AL BUSINESS MACHINES INTERNATION AL BUSINESS MACHS COR 14 Wharton Research Data Services

Linking Table with Permanent Identifiers Eg. Merge Institutional Shareholder Services (ISS) Governance with Compustat fundamental ISS rmgovernance table Compustat funda table COMPANY ID TICKER CUSIP MEETINGDATE CONAME GVKEY DATADATE TIC1 CUSIP1 543331 543331 543331 543331 543331 543331 543331 543331 543331 543331 543331 543331 543331 GOOG GOOG GOOG GOOG GOOG GOOG GOOG GOOGL GOOGL GOOGL GOOGL GOOGL GOOGL 38259P508 38259P508 38259P508 38259P508 38259P508 38259P508 38259P706 38259P508 02079K305 02079K305 02079K305 02079K305 02079K305 20070510 20080508 20090507 20100513 20110602 20120621 20130606 20140514 20150603 20160608 20170607 20180606 20190619 GOOGLE INC GOOGLE INC GOOGLE INC GOOGLE INC. GOOGLE INC. GOOGLE INC. GOOGLE INC. GOOGLE INC. ALPHABET INC. ALPHABET INC. ALPHABET INC. ALPHABET, INC. ALPHABET INC. 160329 160329 160329 160329 160329 160329 160329 160329 160329 160329 160329 160329 160329 20071231 20081231 20091231 20101231 20111231 20121231 20131231 20141231 20151231 20161231 20171231 20181231 20191231 GOOGL GOOGL GOOGL GOOGL GOOGL GOOGL GOOGL GOOGL GOOGL GOOGL GOOGL GOOGL GOOGL 02079K305 02079K305 02079K305 02079K305 02079K305 02079K305 02079K305 02079K305 02079K305 02079K305 02079K305 02079K305 02079K305 CONM AT ALPHABET INC 25335.806 ALPHABET INC 31767.575 ALPHABET INC 40496.778 ALPHABET INC 57851 ALPHABET INC 72574 ALPHABET INC 93798 ALPHABET INC 110920 ALPHABET INC 131133 ALPHABET INC 147461 ALPHABET INC 167497 ALPHABET INC 197295 ALPHABET INC 232792 ALPHABET INC 275909 The linking table COMPNAY ID GVKEY 54331 160329 1. the current primary issue of Alphabet inc. 15 Wharton Research Data Services

Generation of Linking Table Database 2 identifier table Database 1 identifier table Permanent identifier 2 Permanent identifier 1 CUSIP/SEDOL/ISIN/CIK Ticker Date1 Company name Unconventional variables Permanent ID1 Permanent ID2 Start date End date 1. Compare exchange information as well for international companies/securities. Score 16 Wharton Research Data Services

Suggested Identifier Tables for Linking Identifier table Permanent ID or primary ID CUSIP SEDOL ISIN security1 GVKEY X(9) X(7) X(12) company GVKEY g security1 GVKEY g company GVKEY Compustat Snapshot wrds cs names GVKEY X(9) CRSP stocknames PERMCO, PERMNO X(8) IBES id, idsum Ticker (IBES Ticker) X(8) X(6)3 TR Worldscope4 wrds ws ids ITEM6105 X(9) X(7) BoardEx - North America na wrds company names BoardEx - Europe eur wrds company names BoardEx - UK uk wrds company names BoardEx - Rest of World row wrds company names Database Compustat NA Compustat Global BoardID (Company ID) X(7) X(7) CIK Ticker Company name X X(10) X X(10) X X(12) X(12) X(10) 2 X(12) X X X X X X X X X X X 1. The Compustat security tables provide IBES tickers(ibtic). 2. In CRSP, item CUSIP is header CUSIP and item NCUSIP is historical CUSIP. 3. SEDOLs are used for non-US companies. The SEDOL field consists of a country code followed by the first six digits of the official SEDOL. 4. TR Worldscope provides IBES ticker(ITEM6038) in table wrds ws ids. ITEM6105 is the Permanent ID(C 3-character country code 4-character company ID 1 character company or security indicator). 17 Wharton Research Data Services

Suggested Identifier Tables for Linking Database Identifier table Permanent ID/ primary ID AuditAnalytics auditcblock company fkey OptionMetrics secnmd secid Bank Regulatory bankscode1 bhccode2 CUSIP X(6) TAQ Monthly mast YYYYMM symbol X(12)4 TAQ Daily mastm YYYYMMDD sym root sym suffix X(9) MSCI GMI Ratings names ticker, cik, cusip X(9) bond sym id X(9) Issue id, issuer id X(6,9) FISD fisd issue Ticker Company name X X X X X X redcode trace enhanced names CIK RSSD9001 (RSSD ID)3 redent trace names ISIN X(8) Markit Red Trace SEDOL X X X X X X X X X(7) X(12) X 1. ID variables for Commercial Bank 2. ID variables for Bank Holding Company. 3. Linking with CRSP is available on https://www.newyorkfed.org/research/banking research/datasets.html (RSSD ID, PERMCO, DT START, DT END) 4. CUSIP-9 plus extra three characters, which are used by the National Securities Clearing Corporation (NSCC) to identify the exchange. 18 Wharton Research Data Services

Topics Outline 1 Common Identifiers Used for Linking 2 WRDS Linking Procedures 3 Linking Resources on WRDS 19 Wharton Research Data Services

Linking Resources on WRDS Linking Suite by WRDS WRDS Research Macros WRDS Linking Resources Linking Table Contributed by Researchers Linking Table Provided by Data Vendors 20 Wharton Research Data Services

Linking Resources on WRDS TR insiders 1 Monthly TAQ Daily TAQ 1 2 1 FactSet 1 2 OptionMetrics 1 1 CRSP 1 1 6 1 2 Compustat 2 3 7 5 TR IBES 1 Trace TR Dealscan Lender Supply Chain With IDs (Compustat Segment) 8 CIK (WRDS SEC Analytics Suite) 1 4 TR S12 mutual fund holdings 1 TR Dealscan BoardEx Linking Suite by WRDS 2 WRDS Research Macros 3 CRSP/Compustat Merged by CRSP 4 MFLINKS by Russ Wermers (UMD) & WRDS Federal Judicial Center Litigation Company name 8 8 CUSIP 5 Compustat security/g security table 6 Roberts Dealscan-Compustat Linking Database by Michael R. Roberts et al. 7 DealScan Lender Link Tables by Schwert, Michael 8 WRDS SEC Analytics Suite 21 Wharton Research Data Services

T h a n k Yo u !

Back to top button