
Ask questions

_get_response without headers doesn't work (at least with the 'yahoo' source)

To fix it, I put this in base.py:

def _get_response(self, url, params=None, headers=None):
    """ send raw HTTP request to get requests.Response from the specified url
    Parameters
    ----------
    url : str
        target URL
    params : dict or None
        parameters passed to the URL
    headers : dict or None
        HTTP headers to send; a browser-like User-Agent is used when None
    """

    # initial attempt + retry
    if headers is None:
        headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

    pause = self.pause
pydata/pandas-datareader
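The fix above boils down to defaulting the headers when the caller passes none. A minimal standalone sketch of that pattern (the User-Agent string is the same one as in the patch; `default_headers` is an illustrative helper, not part of pandas-datareader's API):

```python
DEFAULT_HEADERS = {
    # the same browser User-Agent string used in the patch above
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/39.0.2171.95 Safari/537.36'
}

def default_headers(headers=None):
    """Return the caller's headers, or the browser-like defaults when None."""
    if headers is None:
        # copy so callers can't mutate the module-level default
        headers = dict(DEFAULT_HEADERS)
    return headers
```

The resulting dict would then be passed to `requests.get(url, params=params, headers=headers)`, which is what makes Yahoo's servers accept the request.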

Answer questions kdvolder

I also noticed this broke and started digging around in the code (but I'm a total newbie to Python, so I'm really just groping around in the dark).

Eventually, after failing to figure out why pandas-datareader stopped working for Yahoo, I wrote some code that uses a different URL: 'https://query1.finance.yahoo.com/v7/finance/download'. This is the URL that downloads data in CSV format when you click the download link on Yahoo's pages. The URL is easy to 'curl' without special headers, cookies, or anything like that (which I would have no idea how to do anyway). It seems to return similar data in a convenient CSV format, so it would be much more convenient to use than the URL datareader currently uses.
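For reference, the request to that download endpoint can be assembled by hand. This sketch only builds the URL string, with no network call; `build_download_url` is an illustrative helper name, and the query parameters are the same ones used in the code below:

```python
from urllib.parse import quote, urlencode

BASE = 'https://query1.finance.yahoo.com/v7/finance/download'

def build_download_url(ticker, period1, period2):
    # period1/period2 are Unix timestamps in seconds; the ticker is URL-escaped
    query = urlencode({
        'period1': period1,
        'period2': period2,
        'interval': '1d',
        'events': 'history',
        'includeAdjustedClose': 'true',
    })
    return f"{BASE}/{quote(ticker)}?{query}"
```

Fetching the resulting URL with plain `curl` or `requests.get` returns the CSV directly, without any special headers.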

Anyways, just in case this is useful (maybe someone can use it to figure out how to fix pandas-datareader without special headers, cookies, and sessions), this is my amateurish code to read stock data from the https://query1.finance.yahoo.com/v7/finance/download URL:

import datetime
import urllib.parse

import pandas as pd
import requests

baseUrl = 'https://query1.finance.yahoo.com/v7/finance/download'

def timestamp(dt):
    # Yahoo expects period1/period2 as Unix timestamps in seconds
    return round(datetime.datetime.timestamp(dt))

def get_csv_data(ticker='SPY', days=200):
    endDate = datetime.datetime.today()
    startDate = endDate - datetime.timedelta(days=days)
    response = requests.get(baseUrl + "/" + urllib.parse.quote(ticker), stream=True, params={
        'period1': timestamp(startDate),
        'period2': timestamp(endDate),
        'interval': '1d',
        'events': 'history',
        'includeAdjustedClose': 'true'
    })
    response.raise_for_status()
    return pd.read_csv(response.raw)

data = get_csv_data()
print(data)

Produces output like:

           Date        Open        High         Low       Close   Adj Close     Volume
0    2020-12-15  367.399994  369.589996  365.920013  369.589996  365.623657   63865300
1    2020-12-16  369.820007  371.160004  368.869995  370.170013  366.197449   58420500
2    2020-12-17  371.940002  372.459991  371.049988  372.239990  368.245209   64119500
3    2020-12-18  370.970001  371.149994  367.019989  369.179993  366.774872  136542300
4    2020-12-21  364.970001  378.459991  362.029999  367.859985  365.463440   96386700
..          ...         ...         ...         ...         ...         ...        ...
133  2021-06-28  427.170013  427.649994  425.890015  427.470001  427.470001   53090800
134  2021-06-29  427.880005  428.559998  427.130005  427.700012  427.700012   35970500
135  2021-06-30  427.209991  428.779999  427.179993  428.059998  428.059998   64827900
136  2021-07-01  428.869995  430.600006  428.799988  430.429993  430.429993   53365900
137  2021-07-02  428.869995  434.100006  430.521790  433.720001  433.720001   57697668

[138 rows x 7 columns]
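The CSV comes back with a plain integer index, as the output shows. A small follow-up step (sketched here on inline sample rows in the same shape, not a live download) parses the Date column and makes it the index:

```python
import io

import pandas as pd

# two sample rows matching the columns of the downloaded CSV above
csv_text = (
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2020-12-15,367.399994,369.589996,365.920013,369.589996,365.623657,63865300\n"
    "2020-12-16,369.820007,371.160004,368.869995,370.170013,366.197449,58420500\n"
)

# parse_dates/index_col turn the Date column into a DatetimeIndex,
# which is what most pandas time-series operations expect
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'], index_col='Date')
```

The same two keyword arguments can be passed to the `pd.read_csv(response.raw)` call in `get_csv_data` above.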
Source: https://uonfu.com/
Answered by Kris De Volder (kdvolder), SpringSource, a Division of VMware, Vancouver, BC, Canada