Python: How to fetch many tweets using the search() function in Twython

In the Twython module you can use the search() function to search for tweets:

import os
from twython import Twython

CONSUMER_KEY    = os.getenv('TWITTER_CONSUMER_KEY')
CONSUMER_SECRET = os.getenv('TWITTER_CONSUMER_SECRET')

# --- main ---

twitter = Twython(CONSUMER_KEY, CONSUMER_SECRET)

result = twitter.search(q='python', count=100)
tweets = result['statuses']

for number, item in enumerate(tweets, 1):
    print(number, '|', item['id'], '|', item['created_at'], '|', item['user']['name'], '|', item['text'][:20])

However, this function can return at most 100 tweets (and often fewer), because the count option does not accept values greater than 100. This limit is imposed by the official Twitter API. To get more, you would have to call the function many times with different values of the max_id= option, each time receiving older tweets.

There is a simpler method, though. You can use the cursor() function to create a generator that can be used in a for loop to fetch any number of tweets. This method actually requires a break to leave the loop, because without it the loop would run (almost) forever.
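The manual max_id approach described above can be sketched as follows. This is only an illustration under stated assumptions: search_many() and fake_search() are hypothetical names, and fake_search() is a stand-in that imitates the shape of the result of twitter.search() so the example runs without credentials. With a real client you would pass twitter.search instead.

```python
def search_many(search, query, limit=250, page_size=100):
    """Collect up to `limit` tweets by calling `search` repeatedly with max_id."""
    collected = []
    max_id = None
    while len(collected) < limit:
        params = {'q': query, 'count': min(page_size, limit - len(collected))}
        if max_id is not None:
            params['max_id'] = max_id
        statuses = search(**params)['statuses']
        if not statuses:
            break  # nothing older is available
        collected.extend(statuses)
        # max_id is inclusive, so ask for tweets strictly older than the oldest one seen
        max_id = min(item['id'] for item in statuses) - 1
    return collected


def fake_search(q, count=15, max_id=None):
    """Hypothetical stand-in for twitter.search: serves ids 1000 down to 751."""
    newest = 1000 if max_id is None else min(max_id, 1000)
    ids = range(newest, max(newest - count, 750), -1)
    return {'statuses': [{'id': i, 'text': f'{q} {i}'} for i in ids]}


tweets = search_many(fake_search, 'python', limit=250)
print(len(tweets), tweets[0]['id'], tweets[-1]['id'])
```

The loop stops either when enough tweets have been collected or when a page comes back empty, which is also how it would end if the API had no older results.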

import os
from twython import Twython

CONSUMER_KEY    = os.getenv('TWITTER_CONSUMER_KEY')
CONSUMER_SECRET = os.getenv('TWITTER_CONSUMER_SECRET')

# --- main ---

client = Twython(CONSUMER_KEY, CONSUMER_SECRET)

tweets = client.cursor(client.search, q='#python')

for number, item in enumerate(tweets, 1):
    print(number, '|', item['id'], '|', item['created_at'], '|', item['user']['name'], '|', item['text'][:20])

    if number >= 100:
        break
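As an alternative to counting with enumerate() and break, a generator can be cut off with itertools.islice() from the standard library. The sketch below uses fake_cursor(), a hypothetical endless generator standing in for client.cursor(client.search, q='#python'), so it runs without credentials.

```python
from itertools import islice


def fake_cursor():
    """Hypothetical stand-in for client.cursor(client.search, q='#python')."""
    tweet_id = 0
    while True:  # endless, like the real generator (almost) is
        tweet_id += 1
        yield {'id': tweet_id, 'text': f'tweet {tweet_id}'}


# Take only the first 100 items; islice stops pulling from the generator after that
first_100 = list(islice(fake_cursor(), 100))

print(len(first_100))
```

With a real client the same call would presumably look like list(islice(tweets, 100)), where tweets is the generator returned by cursor().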

To save the data to a file, you have to collect the tweets in a list:

import os
from twython import Twython
import json

CONSUMER_KEY    = os.getenv('TWITTER_CONSUMER_KEY')
CONSUMER_SECRET = os.getenv('TWITTER_CONSUMER_SECRET')

# --- main ---

client = Twython(CONSUMER_KEY, CONSUMER_SECRET)

tweets = client.cursor(client.search, q='#python')

data = []

for number, item in enumerate(tweets, 1):
    data.append(item)

    if number >= 100:
        break

print(data)

with open('output.json', 'w') as fh:
    json.dump(data, fh)
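Tweets often contain non-ASCII text, so it may be worth writing the file explicitly as UTF-8 and checking that it reads back unchanged. The following is a minimal sketch with hypothetical sample data and a temporary file path in place of real tweets and output.json; encoding=, ensure_ascii= and indent= are standard parameters of open() and json.dump().

```python
import json
import os
import tempfile

# Hypothetical sample standing in for the collected tweets
data = [
    {'id': 1, 'text': 'zażółć gęślą jaźń', 'user': {'name': 'Alice'}},
    {'id': 2, 'text': 'hello 🐍', 'user': {'name': 'Bob'}},
]

path = os.path.join(tempfile.mkdtemp(), 'output.json')

# ensure_ascii=False keeps the characters readable in the file instead of \uXXXX escapes
with open(path, 'w', encoding='utf-8') as fh:
    json.dump(data, fh, ensure_ascii=False, indent=2)

# Read the file back to check that nothing was lost
with open(path, encoding='utf-8') as fh:
    loaded = json.load(fh)

print(loaded == data)
```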

You can use this data to create a DataFrame

df = pd.DataFrame(data)

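or you can use df.append(item, ignore_index=True) to add tweets to an existing DataFrame as they arrive (note that DataFrame.append() was removed in pandas 2.0, so this variant needs an older pandas version)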

import os
from twython import Twython
import pandas as pd

CONSUMER_KEY    = os.getenv('TWITTER_CONSUMER_KEY')
CONSUMER_SECRET = os.getenv('TWITTER_CONSUMER_SECRET')

# --- main ---

client = Twython(CONSUMER_KEY, CONSUMER_SECRET)

tweets = client.cursor(client.search, q='#python')

df = pd.DataFrame()

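for number, item in enumerate(tweets, 1):
    # NOTE: DataFrame.append() was removed in pandas 2.0; this loop needs an older pandas
    #df = df.append(item, ignore_index=True)

    # add only the selected fields of the current tweet
    df = df.append({
                    'id': item['id'],
                    'user': item['user']['name'],
                    'text': item['text'],
                   }, ignore_index=True)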

    if number >= 100:
        break

print(df)

df.to_csv('output.csv')
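Because DataFrame.append() is no longer available in pandas 2.0 and later, one way to get the same result there is to collect plain dictionaries in a list and build the DataFrame once at the end. The sketch below is an assumption-laden illustration: the tweets list is hypothetical sample data standing in for the generator returned by client.cursor(), and the limit is lowered to 2 to keep the example small.

```python
import pandas as pd

# Hypothetical sample standing in for items yielded by client.cursor(...)
tweets = [
    {'id': 1, 'text': 'first', 'user': {'name': 'Alice'}},
    {'id': 2, 'text': 'second', 'user': {'name': 'Bob'}},
    {'id': 3, 'text': 'third', 'user': {'name': 'Carol'}},
]

rows = []

for number, item in enumerate(tweets, 1):
    # keep only the selected fields of the current tweet
    rows.append({
        'id': item['id'],
        'user': item['user']['name'],
        'text': item['text'],
    })

    if number >= 2:
        break

# build the DataFrame once, after the loop
df = pd.DataFrame(rows)

print(df)
```

Building the frame once also avoids copying the whole DataFrame on every iteration, which is what row-by-row appending does.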

Notes:

Twython documentation: Search Generator
