I have decided to build a database for myself as a way to learn the ins and outs of the process. I picked a project that interests me both personally and professionally. To start, I have made a rough outline of the project's scope and a simple flow chart to help me work out the steps. I have done a similar project in the past, but this one will use file types I have no previous experience with and MS SQL, which I have very limited experience with.

This is my rough flow chart, and the scope of my project is this: to write a program in Python that combs through an API dataset made up of all the stocks in the market and filters them on a few things:

1. Do they pay out dividends? (This keeps cash flowing without having to sell the stock.)
2. Do they have positive net income or revenue? (Seems simple: why invest in a company that is losing money?)
3. What exchange are they listed on? (If they aren't on an exchange that my broker allows me to trade on, then why bother worrying about the stock?)
4. Do they have a surplus of cash/assets on hand compared to debt? (This is to make sure a company can survive a massive economic downturn or adapt quickly to a major market shift, among other things. I might consider getting rid of this requirement later; to put it bluntly, I currently need more data on business cases in the face of recessions.)
5. Is the data up to date? (Very simple: if the stock data is not up to date, then it most likely isn't relevant. If it is more recent than the data already stored in the file, then I simply amend the file data for that specific stock.)

If the data passes these steps, then it will be added or amended. This will be done in a while loop to make sure that it combs through the entire stock list available.

The information I will save will be:

1. stock symbol
2. full stock name
3. date
4. the exchange it is listed on
5. industry
6. sector
7. net income
8. cash on hand
9. price
10. dividend
11. dividend as a percent of price: ((dividend * dividend periods) / price) * 100

The file type will be either CSV or Excel (I need to learn more to understand which is better for my purposes). The file will be linked to, and be the basis for, an MS SQL database, as I will sort this data in MS SQL and might later add more relevant data to compare against, such as GDP growth data from the U.S. Bureau of Economic Analysis at the U.S. Department of Commerce.
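To make the plan above more concrete, here is a minimal sketch of the five filters, the add-or-amend step, and the dividend percent formula. It is only one way this could look, not the finished program. Everything named here is hypothetical: `StockRecord`, `passes_filters`, `upsert`, `screen`, `to_csv`, the `total_debt` and `periods_per_year` fields, and the 7-day freshness cutoff are all assumptions made for illustration. It does not call any real stock API; it assumes the records have already been fetched. It uses net income for filter 2, adds a `price > 0` guard so the formula never divides by zero, and uses a `for` loop in place of the while loop mentioned above.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from datetime import date, timedelta


@dataclass
class StockRecord:
    # Hypothetical record layout based on the fields listed in the plan.
    symbol: str
    name: str
    as_of: date            # date the data refers to
    exchange: str
    industry: str
    sector: str
    net_income: float
    cash_on_hand: float
    total_debt: float      # assumption: needed only for the cash-vs-debt filter
    price: float
    dividend: float        # dividend paid per period
    periods_per_year: int  # assumption: "dividend periods" means payouts per year


def dividend_yield_percent(dividend, periods_per_year, price):
    """((dividend * dividend periods) / price) * 100"""
    return (dividend * periods_per_year) / price * 100


def passes_filters(rec, allowed_exchanges, today, max_age_days=7):
    """Apply the five filters; max_age_days is an assumed freshness cutoff."""
    return (
        rec.price > 0                                             # guard for the formula
        and rec.dividend > 0                                      # 1. pays dividends
        and rec.net_income > 0                                    # 2. not losing money
        and rec.exchange in allowed_exchanges                     # 3. tradable exchange
        and rec.cash_on_hand > rec.total_debt                     # 4. cash surplus over debt
        and (today - rec.as_of) <= timedelta(days=max_age_days)   # 5. up to date
    )


def upsert(store, rec):
    """Add rec, or amend the stored row only if rec is more recent."""
    existing = store.get(rec.symbol)
    if existing is None or rec.as_of > existing.as_of:
        store[rec.symbol] = rec


def screen(records, allowed_exchanges, today, store=None):
    """Run every record through the filters and keep the ones that pass."""
    store = {} if store is None else store
    for rec in records:
        if passes_filters(rec, allowed_exchanges, today):
            upsert(store, rec)
    return store


def to_csv(store):
    """Return the kept records as CSV text, with the dividend percent column added."""
    columns = [f.name for f in fields(StockRecord)] + ["dividend_yield_pct"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for rec in store.values():
        row = asdict(rec)
        row["dividend_yield_pct"] = round(
            dividend_yield_percent(rec.dividend, rec.periods_per_year, rec.price), 2
        )
        writer.writerow(row)
    return buffer.getvalue()
```

To try it, build a few `StockRecord` objects by hand, call `screen(records, {"NYSE"}, date.today())`, and write the text from `to_csv(...)` to a file. Keeping the store as a dictionary keyed by symbol makes the "amend if more recent" rule a single comparison, and the same dictionary can be passed back in as `store` on the next run.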