FactSet OnDemand Tips & Techniques Manual


FactSet Research Systems
Version 3.3A

FactSet Research Systems Inc. Copyright 2015 FactSet Research Systems Inc. All rights reserved. www.factset.com

Table of Contents

Chapter 1. Introduction
  1.1 Overview
  1.2 OnDemand Q&A
    1.2.1 OnDemand Interface Questions
    1.2.2 Data Request Questions
Chapter 2. Extract Data from FactSet
  2.1 FactSet Syntax
    2.1.1 FactSet Query Language (FQL) – Time Series Access
    2.1.2 Screening Language – Cross Sectional Access
  2.2 OnDemand Requests and FactSet Syntax
  2.3 OnDemand Factlet Requests
    2.3.1 Standard Factlets
    2.3.2 Specialized Factlets
    2.3.3 Other FactSet Functions
  2.4 The “Ideal” Data Structure
Chapter 3. Speed of OnDemand Data Requests
  3.1 Factlet Request
    3.1.1 OnDemand factlet request compared to Excel Data Downloading
  3.2 Factors Affecting Speed of OnDemand Requests
    3.2.1 System Factors on Client’s PC
    3.2.2 Fetching and Parsing of Data
    3.2.3 ExtractFormulaHistory and ExtractDataSnapshot
  3.3 Tips to make the requests more efficient:
    3.3.1 Time-series and static data
    3.3.2 Time-series data of different frequencies
Chapter 4. FactSet OnDemand Glossary
Appendix
FactSet Consulting Services

Chapter 1. Introduction

1.1 Overview

FactSet supports the major tasks frequently undertaken in a data mining project, such as data acquisition, data preparation, modeling, and model execution with reporting/graphing. This manual provides an overview of tips and techniques for improving the performance of FactSet OnDemand requests in statistical packages.

1.2 OnDemand Q&A

The following are some of the most common troubleshooting questions to consider when installing the OnDemand Interface and setting up data requests.

1.2.1 OnDemand Interface Questions

Are you installing the FactSet OnDemand Interface for MATLAB, R or the developer’s toolkit?
- For MATLAB you need MATLAB version 2011b or above and a subscription to the MATLAB Datafeed Toolbox.
- For R you need R version 2.1.0 or above.
- What language and version are you using for the developer’s toolkit installation? [1]

Are you installing the plugin in the correct directory?
- For MATLAB and R you need to select the root directory of the version of the stat package you are using. For example, you cannot install to the MATLAB directory; instead you should install to the MATLAB/R2014A directory.

Does the version of the plugin match the software version you are using?
- If you use a 32 bit software version you should always use the 32 bit plugin version (regardless of whether your Windows architecture is 32 bit or 64 bit).

Do you have read and write access to the temp file where the data is first downloaded to?
- In some cases (such as running through a server) you might need to change the directory for the temp file by 6), newDirectory)

Are you using the correct credentials for OnDemand requests?
- OnDemand credentials are different from the FactSet Workstation username and password. If you are getting an error message stating Invalid credentials, verify that OnDemand credentials are being used.

[1] Please contact your FactSet representative for further details on supported versions for the different developer’s toolkit languages.

Are you making requests using a pre-release version of the statistical package (such as MATLAB and R)?
- Pre-release versions of MATLAB and R are not officially supported by the OnDemand Plugin. Check with FactSet support.

Are you using your company specific proxy settings?
- In the main configuration dialog box, the user has the option to input their relevant proxy information.

Does your firewall allow web calls to the FactSet servers?
- Whitelist https://*.factset.com on firewalls / URL inspectors / IPS / IDS.
- Ensure that https:// (TCP 443) is allowed to FactSet's production destination subnets 192.234.235.0 (255.255.255.0) and 64.209.89.0 (255.255.255.0).

1.2.2 Data Request Questions

Are you making a time series data request or a request as of a single date? Are you using the correct function or factlet?
- To determine the most effective FactSet data access method and factlet to use, refer to Chapter 2.

Are you using the date parameter to specify the date range for your request?
- Always include the date parameter in your factlet. This ensures that the dates are lined up between the code used and the output date in the response.

Is your request too large? How many data items are you trying to download in the request?
- For efficiency reasons there are limitations on the number of data points requested in FactSet OnDemand. The timeout for requests is set to a maximum of 15 minutes. For an example of how to break up a request, refer to the scripts in Appendix 1.

Are you downloading Fixed Income data with a screening factlet?
- The universe type needs to be specified as debt for fixed income data in screening.

Are you using the calendar setting?
- The default calendar is set to US. To change it, use the calendar parameter.

Do you get an error when running the factlet? What is the error message? Is it the same error for all factlets or do you get the expected data for other requests?
- This information is important from a troubleshooting perspective.
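The firewall question above quotes two destination subnets in address/netmask form. As a minimal sketch, the membership test a network administrator might script can be written with Python's standard ipaddress module. The function name in_factset_subnet and the sample addresses are illustrative assumptions; only the two subnets come from this manual.

```python
import ipaddress

# The two production destination subnets quoted in section 1.2.1,
# written as address/netmask exactly as the manual lists them.
FACTSET_SUBNETS = [
    ipaddress.ip_network("192.234.235.0/255.255.255.0"),
    ipaddress.ip_network("64.209.89.0/255.255.255.0"),
]


def in_factset_subnet(address: str) -> bool:
    """Return True if the address falls inside one of the quoted subnets.

    Hypothetical helper for illustration; not part of any FactSet tool.
    """
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in FACTSET_SUBNETS)


if __name__ == "__main__":
    # Sample addresses are made up for the demonstration.
    for sample in ("192.234.235.17", "64.209.90.1"):
        print(sample, in_factset_subnet(sample))
```

A check like this only confirms that an address belongs to the listed ranges; whether TCP 443 is actually open still has to be verified on the firewall itself.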

Chapter 2. Extract Data from FactSet

2.1 FactSet Syntax

FactSet stores all of the available data in proprietary database structures on FactSet computers. This allows FactSet to adjust the way data is stored, so that clients can access data as efficiently as possible. Most datasets available on FactSet are stored in two different ways, so as to facilitate two different data access methods. These two options use the FactSet Query Language (FQL) for time series requests and the FactSet Screening Language (Screening) to efficiently extract data for a large universe of securities as of a single date. Please see Online Assistant page 10410 for further information on the storage methods.

2.1.1 FactSet Query Language (FQL) – Time Series Access

The extraction of a time series of data from FactSet, either for an index or security (e.g. the last 4 quarters of EPS for the S&P 500 or for Exxon) or for macroeconomic data (e.g. the last 5 years of industrial production for the BRIC countries), can be done using the FactSet Query Language (FQL). FQL is a proprietary data retrieval language used to access FactSet data. For more information on FQL, see the FactSet Online Assistant page 1961.

To request a time series of data, a start date, end date and frequency need to be specified. If a date is not specified, data is returned from the most recent time period. The dates can be either absolute dates or relative dates.

Some advantages of FQL include:
- The ability to specify dates for any database using the same formats. With FQL, date formats are flexible. You can use a number of consistent date formats (defined by FQL) for all databases, which makes using and combining data from different databases simple.
- The ability to iterate items, formulas, and functions at any frequency. For example, you can request a series of weekly price to earnings ratios.

2.1.2 Screening Language – Cross Sectional Access

Alternatively, the extraction of data for a list of ids for 1 date, for both equity and fixed income securities, is best done using the Screening Language. The FactSet Screening Language is a way to efficiently extract data for a large universe of securities as of a single date.

By default, the Screening Language does not allow iteration and therefore cannot be used to return a time series of data with a single request code. To request data as of a single historical date, an absolute or relative date can be specified.

Overall, FQL syntax should be used to retrieve data for many data items, or to download time series data. Screening syntax should be used to retrieve data for a large universe for a single point in time.
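A time series request needs a start date, an end date and a frequency, and the examples later in this manual pass them as a single colon-separated string such as '0:-3y:d' or 'sd:ed:frq'. The sketch below assembles such a string. The helper name date_range_argument and its validation rules are assumptions made for illustration; it does not check that the parts are valid FactSet dates or frequencies.

```python
def date_range_argument(start: str, end: str, frequency: str) -> str:
    """Join start date, end date and frequency into one 'sd:ed:frq' string.

    Hypothetical helper: it only guards against empty parts and stray
    colons, and does not validate the dates against FactSet's rules.
    """
    parts = (start, end, frequency)
    for part in parts:
        if not part or ":" in part:
            raise ValueError(f"invalid date range component: {part!r}")
    return ":".join(parts)


if __name__ == "__main__":
    # Reproduces the relative date range used in the manual's examples.
    print(date_range_argument("0", "-3y", "d"))
```

Keeping the three parts separate until the last moment makes it easier to vary only the frequency or only the history length when comparing request timings.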

2.2 OnDemand Requests and FactSet Syntax

FactSet OnDemand data requests are done via web requests that retrieve a bundle of FactSet data in a variety of tagged or delimited formats. The OnDemand integration generates a URL that creates an https request. The data request is transmitted over the internet to FactSet DataDirect servers. DataDirect uses HTTP basic authentication over Secure Sockets Layer (SSL). The DataDirect servers handle authentication of the user and the permissioning of data sets. The OnDemand integration is designed to provide simple access to FactSet data in reasonably sized blocks through a web service.

FactSet has written a number of functions for the FactSet OnDemand plugins that retrieve FactSet data using FQL or Screening syntax by calling stored procedures (factlets) on the FactSet OnDemand servers. Factlets are server-based functions that encapsulate business logic and data collection procedures. A factlet can handle a simple data request or can invoke complex application logic. The data items that are requested, along with an identifier and date, are stored in a structure that is returned. Entities (securities, indices, etc.) and the time dimension are stored in arrays, either separately or combined together into one.

2.3 OnDemand Factlet Requests

The following is a list of the factlets available using the OnDemand MATLAB, R, Developer’s Toolkit and SAS integrations. Not all factlets are available in all integrations. The description for each factlet also highlights whether the factlet should be used with FQL or Screening syntax.

The factlets should be chosen depending on the dataset required. There are general factlets that take actual Screening or FQL codes as input (to find the correct code, use the FactSet Sidebar look-up dialog) and specialized factlets used for specific datasets.

2.3.1 Standard Factlets

The Standard Factlets below are used for Screening data, Economics data and FQL data. For the exact input syntax, the FactSet Sidebar dialog box can be used.

Factlet (FactSet syntax used by factlet)

ExtractDataSnapshot (Screening)
Function is used for extracting one or more items for a list of ids for 1 date, for both equity and fixed income securities. Should be used to efficiently extract data for a large universe of securities as of a single date.
The data can also be retrieved using a backtest date to avoid look-ahead bias in the analysis. The backtest functionality is available to clients subscribing to one of FactSet’s quantitative applications, such as Alpha Testing or Portfolio Simulation.

ExtractEconData (FQL)
Function provides access to a broad array of macroeconomic content, interest rates and yields, country indices and various exchange rate measures from both the FactSet Economics and the Standardized Economic databases.

ExtractFormulaHistory (FQL)
Function is used for extracting one or more items for one security, an index or a list of securities over time.

2.3.2 Specialized Factlets

The specialized factlets are developed for different content sets or specialized data structures. These factlets have been developed to simplify and standardize the data retrieval of more complex data structures.

Factlet (FactSet syntax used by factlet)

CorporateActionsDividends (FQL)
Function is used for extracting stock dividend information.

CorporateActionsSplits (FQL)
Function is used for extracting stock split information.

EstimatesOnDemand (FQL)
Function provides access to FactSet sourced company level estimates data. The data is accessed through the following reports that are available with this function: ance, Surprise, Detailed Recommendations and Consensus

tion provides access to data from Alpha Testing model results. Alpha Testing is a tool available in the FactSet workstation used to assess the relationship between one or more variables and subsequent returns over time. A subscription to Alpha Testing in FactSet is necessary to extract this data in the stat packages.

ExtractBenchmarkDetail (Screening)
Function is used for extracting multiple data items for a benchmark. Benchmark data can be retrieved using other functions, such as ExtractFormulaHistory, but the ExtractBenchmarkDetail function allows a user to retrieve a more comprehensive overview of the index constituent data, without additional codes or calculations. In the default output, identifiers are sorted in descending order by weight in the index and each row shows the index id, company id, date, ticker, and weight. Additional items are displayed at the end.
Note: The ExtractBenchmarkDetail function by default uses Screening codes entered in the items argument of the syntax. If using an FQL code, enter an before the FQL items code.

ExtractOFDBItem (Screening)
Function provides access to a list of securities and multiple data items for a range of dates uploaded into a single Open FactSet Database (OFDB).
Note: The ExtractOFDBItem function by default uses Screening. FQL should be used when using ids with spaces or short positions, indicated in the OFDB with an S.

ExtractOFDBUniverse (FQL)
Function provides access to a list of securities belonging to a single Open FactSet Database (OFDB) file as of a single date.

ExtractScreenUniverse (Screening)
Function is used for extracting a list of identifiers stored in a single FactSet screen. In the FactSet workstation, a user can screen for securities based on specified criteria and store the result using FactSet Universal Screening for equity or debt securities.

ExtractOptionsSnapshot (FQL)
Function is used for extracting options data for one or more conditions from the FactSet-Options Derived Values database.

ExtractSPARData (FQL)
Function is used for displaying SPAR data for specified funds from databases that include S&P, Lipper, Morningstar, Russell, eVestment, Nelson, Rogerscasey, and PSN. A subscription to SPAR in FactSet is necessary to be able to extract this data in

ExtractVectorFormula
The ExtractVectorFormula function is used for extracting FactSet data that is stored in a vector data format, where the data array does not have a predefined size and is organized by the vector position. A vector can be thought of as a list that has one dimension, a row of data. A vector position allows a particular element of the array to be accessed.
ExtractVectorFormula handles non-sequential data with support for matrix or vector output. The nature of the data determines whether the output is a matrix or a vector; the function has no option for choosing the format in which the data is returned. This type of data includes corresponding geographic or product segment breakdowns for a company, or detailed broker snapshot or history estimates/analyst information.

LSD Ownership (FQL)
The FactSet Ownership database collects global equity ownership data for institutions, mutual fund portfolios, and insiders/stake holders. Detailed ownership data can be extracted by company or by holder (institution, mutual fund, and insider/stake). The LSD Ownership function is used for extracting one or more data items from the FactSet Ownership database for one or multiple securities or holders.

2.3.3 Other FactSet Functions

The Standard Factlets below are used for Screening data, Economics data and FQL data. For the exact input syntax, the FactSet Sidebar dialog box can be used.

Factlet (FactSet syntax used by factlet)

TickHistory (FQL)
TickHistory is used for extracting real-time trading details for a specific security. The data comes from FactSet’s Time and Sales database, which provides history of quotes and trades for a trailing 60 days, or up to 1 year with an additional subscription.

Streaming Realtime Exchange Data (FQL)
Requires an additional subscription and FactSet plugin version 3.0 for MATLAB and 3.1 for R/Developer’s Toolkit.
The realtime function is used to stream realtime exchange data and will update with each trade.

Documents (FQL)
Requires an additional subscription and FactSet plugin version 3.1.
The Documents service provides access for the retrieval of news stories, investment research reports, filings, and transcripts. When requested, summary information of the documents will be returned, including an http URL to access the resulting documents.

Snapshot (FQL)
Requires an additional subscription and FactSet plugin version 3.1.
The Snapshot service provides access to streaming exchange data and allows the “snap” of real-time prices at the user’s request. When the command is run, a list of all available exchange data items will be returned by default.

F.Cancel()
Requires FactSet plugin version 3.2.
The F.Cancel() function allows the user to manually cancel all requests on the backend, which will allow for another request to be made.

Please consider Chapter 2.1 regarding FQL vs Screening when selecting an individual factlet to retrieve data.

2.4 The “Ideal” Data Structure

There is no “ideal” data structure, but there are some general guidelines that will help the user. When making OnDemand data requests using a factlet, it is important to consider the memory limitations of any statistical package storing data. FactSet defines one data request as a request that generates a response limited to 1 million data points. A data request beyond 1 million data points needs to be broken down into smaller pieces in order to move it and then store it permanently. Requests that are too large take too long to satisfy, and performance suffers.

The scripts in the Appendix provide an example of breaking down a request, with comments added to explain each line of the script. For further examples, the OnDemand MATLAB User Guide is available in Online Assistant page 15262 and the OnDemand R User Guide is available in Online Assistant page 15239.
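To make the 1 million data point guideline concrete, the sketch below estimates the size of a request and splits an id list into pieces that stay under the limit. It assumes that the number of data points is simply ids × items × dates, which is a simplification introduced here and not a definition taken from this manual. All function names are hypothetical, and the scripts in the Appendix remain the reference example.

```python
from typing import List, Sequence

MAX_POINTS = 1_000_000  # per-request limit quoted in section 2.4


def estimate_points(n_ids: int, n_items: int, n_dates: int) -> int:
    """Rough request size, ASSUMING points = ids x items x dates."""
    return n_ids * n_items * n_dates


def ids_per_request(n_items: int, n_dates: int, limit: int = MAX_POINTS) -> int:
    """Largest number of ids that keeps one request within the limit."""
    points_per_id = n_items * n_dates
    if points_per_id <= 0:
        raise ValueError("n_items and n_dates must be positive")
    if points_per_id > limit:
        raise ValueError("a single id already exceeds the limit; split by date")
    return limit // points_per_id


def split_ids(ids: Sequence[str], chunk_size: int) -> List[List[str]]:
    """Cut the id list into consecutive chunks of at most chunk_size ids."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [list(ids[i:i + chunk_size]) for i in range(0, len(ids), chunk_size)]


if __name__ == "__main__":
    ids = [f"ID{i}" for i in range(600)]   # placeholder identifiers
    size = ids_per_request(n_items=3, n_dates=780)
    print(size, [len(chunk) for chunk in split_ids(ids, size)])
```

Each chunk can then be passed to the factlet call in a loop and the partial results combined in the statistical package.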

Chapter 3. Speed of OnDemand Data Requests

This chapter provides a breakdown of the steps involved in making an OnDemand data request using factlets. It describes each step in delivering FactSet data to the statistical package, and the factors that may affect the speed of those requests.

3.1 Factlet Request

The OnDemand factlet data requests are made via the Internet. When a request is created from the client’s application using the FactSet DataDirect technology, a URL call is made, which goes to the FactSet OnDemand API.

The client’s request is sent over the Internet as an HTTPS request to the FactSet OnDemand servers. The request is then sent to the FactSet databases, which cover a wide range of data, such as Fundamentals, Estimates, Ownership, or Economics. The fetched data is then returned and parsed along the same path to the client’s application.

The following are the steps involved in delivering data from FactSet when making an OnDemand request in a statistical package:
1. A local function in the statistical package takes the request and forms a URL POST request upload document.
2. The HTTPS POST request is made by the FactSet API (Kratos.dll) to the OnDemand servers.
3. The FactSet OnDemand servers analyze the request and collect data from various databases.
4. The data returned in step 3 is analyzed to ensure that the dates and identifiers are aligned across the various datasets downloaded.
5. The response to the POST request is downloaded to a local stat package function, which takes the response document and parses it into a native statistical package data structure.

3.1.1 OnDemand factlet request compared to Excel Data Downloading

The OnDemand factlet request and Excel Data Downloading take very different approaches but fundamentally rely on the same FQL engine to retrieve data. It is important to note that the OnDemand plugin was developed to simplify data downloading directly into a third party application and may have speed differences compared to Excel data downloading.

Excel Data Downloading lets you create customized spreadsheet reports that display information from databases on FactSet. The template is a spreadsheet in which you specify companies or securities and request the exact information from FactSet that you want displayed for the companies.

OnDemand Factlet Request
- Called programmatically by calling a function with arguments.
- Interact with the statistical package to make data requests.
- Factlet reports are used for retrieving FactSet data.
- Uses the internet to hit the DataDirect servers.
- Retrieved data is parsed into a stat package data structure.

Excel Data Downloading
- Called by entering a formula into an Excel template.
- Template is used to create a report that includes all the data and formatting you requested or specified in your template.
- Data delivered as an XLS file.
- Fetches data from the mainframe (using internet or WAN).

Please note that the additional step of parsing the data into a statistical package friendly format can create differences in retrieval times between the two methods.

3.2 Factors Affecting Speed of OnDemand Requests

There are a number of factors that affect the speed of retrieving OnDemand data requests.

3.2.1 System Factors on Client’s PC

Components within your computer, such as system load, internet bandwidth, application memory and computer configuration variables, are some of the system factors that affect the speed of retrieving and parsing OnDemand data requests.

3.2.2 Fetching and Parsing of Data

During the factlet request process, most of the data transfer time is spent in two operations: fetching (data collection and alignment by the FactSet server) and parsing into a native data structure (in the stat package on the local PC). System load on the FactSet servers is determined by the number of users and the time of day, and can have a significant effect on the time it takes to fetch data.

The following examples provide a breakdown of the difference in time using different combinations of splitting up a sample request.

3.2.3 ExtractFormulaHistory and ExtractDataSnapshot

In this example using the ExtractFormulaHistory factlet, price and trading volume for the constituents of STOXX 600 over the last three years are being extracted.

    F.ExtractFormulaHistory(ids,'p_price,p_volume_day','0:-3y:d')

The ids variable is the constituents of STOXX 600 split up in different numbers of iterations, where 1 iteration means that all 600 ids are requested at once, 4 means that 150 ids are requested in each call, etc.

Number of Iterations    1            4            6            10
Total Time (s) [2]      123.09358    116.99985    112.32798    122.26546

As seen in the table it is more efficient, up to a certain point, to split up a request rather than requesting all data in one call. The optimal number of iterations depends on the number of items, the length of history and other factlet specific parameters.

The same trend can be noticed for ExtractDataSnapshot requests split up on dates.

In the below example using ExtractDataSnapshot, the constituents of STOXX and their price, volume and market value are requested over the last 18 months on a daily frequency.

    F.ExtractDataSnapshot(ids,'p_price,p_volume_day','sd:ed:frq')

Iterations in this case are made on dates: 1 iteration means the full 18 months of data in one request, 2 iterations means 9 months of data in one call, etc.

Number of Iterations    1          2          3          6
Total Time (s)          323.549    313.377    309.340    314.215

For large requests it will be more efficient to split the request into multiple smaller calls. For ExtractFormulaHistory it is best to iterate over IDs. For ExtractDataSnapshot it is more efficient to iterate over dates.

As noted above, there are no hard and fast rules, so once you start to work on the efficiency of your application, it's best to try a few approaches and map the results to see which works best.

[2] Based on average numbers
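The timing comparison above can be reproduced with a small harness that splits the id list into a chosen number of iterations and measures the total elapsed time. The sketch below is only an outline under stated assumptions: fetch stands in for whatever factlet call the statistical package makes (it is a hypothetical callable, not a FactSet function), and the timings it reports depend entirely on that callable.

```python
import time
from typing import Callable, List, Sequence, Tuple


def split_into_iterations(ids: Sequence[str], n_iterations: int) -> List[List[str]]:
    """Split ids into at most n_iterations consecutive, near-equal chunks."""
    if n_iterations < 1:
        raise ValueError("n_iterations must be at least 1")
    if not ids:
        return []
    size = -(-len(ids) // n_iterations)  # ceiling division
    return [list(ids[i:i + size]) for i in range(0, len(ids), size)]


def timed_fetch(
    ids: Sequence[str],
    n_iterations: int,
    fetch: Callable[[List[str]], list],
) -> Tuple[list, float]:
    """Call fetch once per chunk and return (combined rows, seconds elapsed)."""
    started = time.perf_counter()
    rows: list = []
    for chunk in split_into_iterations(ids, n_iterations):
        rows.extend(fetch(chunk))
    return rows, time.perf_counter() - started


if __name__ == "__main__":
    ids = [f"ID{i}" for i in range(600)]      # placeholder identifiers

    def fake_fetch(chunk: List[str]) -> list:
        # Stand-in for a real request; returns one row per id.
        return [(identifier, 0.0) for identifier in chunk]

    for n in (1, 4, 6, 10):
        rows, seconds = timed_fetch(ids, n, fake_fetch)
        print(n, len(rows), round(seconds, 6))
```

The same pattern applies to date-based iteration for ExtractDataSnapshot: split the date range into sub-periods instead of splitting the id list, and time each configuration the same way.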

3.3 Tips to make the requests more efficient:

3.3.1 Time-series and static data

If both time-series data (such as price) and static data (such as company name) are needed, consider splitting up the request so that static data is not requested as a time series.

In the below example, price, volume and company name are requested for 3 years on a daily basis. If the static field is broken out into a separate call, the combined time will be shorter. This becomes more important the more history is requested.

    F.ExtractFormulaHistory(ids,'p_name,p_price,p_volume_day','0:-3ay:d')

The above example takes longer than the two requests below combined, where the static name field is broken out into a separate request.

    F.ExtractFormulaHistory(ids,'p_price,p_volume_day','0:-3ay:d')
    F.ExtractFormulaHistory(ids,'p_name','0')

3.3.2 Time-series data of different frequencies

If the requested data is of mixed frequencies, it will be more efficient to break up the call to use one request per frequency:

    F.ExtractFormulaHistory(ids,'p_price,ff_sales,ff_eps','0:-3ay:d')

Instead of requesting all data items on a daily basis as above, it is more efficient to divide the items into different requests depending on the items’ frequency, as in the below examples.

    F.ExtractFormulaHistory(ids,'p_price','0:-3ay:d')
    F.ExtractFormulaHistory(ids,'ff_sales,ff_eps','0:-3ay:y')
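Both tips in this section amount to grouping items by how often they change and issuing one request per group. The sketch below builds the (items, dates) argument pairs for such a split. The function names, the "static" label, and the underscore spelling of the item codes are assumptions made for illustration; the date strings follow the pattern of the examples above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_items_by_frequency(item_frequency: Dict[str, str]) -> Dict[str, str]:
    """Map each frequency label to a comma-separated list of its items."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for item, frequency in item_frequency.items():
        groups[frequency].append(item)
    return {frequency: ",".join(items) for frequency, items in groups.items()}


def build_requests(
    item_frequency: Dict[str, str], history: str = "0:-3ay"
) -> List[Tuple[str, str]]:
    """Return one (items, dates) pair per frequency group.

    Items labelled "static" (a label invented for this sketch) get the
    single date '0'; the others get '<history>:<frequency>'.
    """
    requests = []
    for frequency, items in group_items_by_frequency(item_frequency).items():
        dates = "0" if frequency == "static" else f"{history}:{frequency}"
        requests.append((items, dates))
    return requests


if __name__ == "__main__":
    wanted = {"p_price": "d", "ff_sales": "y", "ff_eps": "y", "p_name": "static"}
    for items, dates in build_requests(wanted):
        # Each pair would become the items and dates arguments of one call.
        print(items, dates)
```

Each pair corresponds to one ExtractFormulaHistory call, so daily, annual and static items are never forced onto the same date grid.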

Chapter 4. FactSet OnDemand Glossary

Factlet
Factlets are server-based functions that encapsulate business logic and data collection procedures. A factlet can handle a simple data request or can invoke complex application logic.

Request Result
The data items that are requested, along with an identifier and date, are stored in a structure that is returned. Entities (securities, indices, etc.) and the time dimension are stored in arrays, either separately or combined together into one.

Interface Function
A function in the stat package that takes the parameters specified by the client, makes the call to the factlet, and returns the results as a data object.

FactSet OnDemand
FactSet OnDemand provides synchronous access to FactSet factlets via the standard HTTPS protocol. Data can be returned in a number of formats, such as XML or CSV file. There are many reports and services available. A user can make custom requests by changing the request URL to contain the
