(en) MasterBot
Latest revision as of 25 April 2019, 15:52
Language | Français | English |
Description
The MasterBot manages and orchestrates the Wikipast bots. First, it lets a user launch each bot individually by entering its parameters in a dashboard. Second, it lets a user define custom launch sequences by specifying parameters for each bot (e.g. frequency and launch order).
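As an illustration of what a launch sequence could look like, here is a minimal sketch. It is not the MasterBot's actual data model: the `BotStep` class, its field names, the `bots_due` helper and the bot name `ExampleBot` are all hypothetical and only show how frequency and launch order might be represented.

```python
from dataclasses import dataclass, field


@dataclass
class BotStep:
    """One entry in a launch sequence (all field names are hypothetical)."""
    name: str                  # bot to launch, e.g. "TranslatorBot"
    order: int                 # position of the bot within one pass
    every_n_runs: int = 1      # frequency: launch on every n-th pass
    params: dict = field(default_factory=dict)


def bots_due(sequence, run_number):
    """Return the names of the bots to launch on this pass, in launch order."""
    due = [step for step in sequence if run_number % step.every_n_runs == 0]
    return [step.name for step in sorted(due, key=lambda step: step.order)]


sequence = [
    BotStep("TranslatorBot", order=2, every_n_runs=2, params={"pages": ["Lausanne"]}),
    BotStep("ExampleBot", order=1),  # hypothetical second bot
]

print(bots_due(sequence, run_number=1))
print(bots_due(sequence, run_number=2))
```

In this sketch `ExampleBot` is due on every pass, while `TranslatorBot` is due only on every second pass and always runs after it.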
Start the bots individually
The first step is to keep the code of each bot on a dedicated page (or section of a page), so that a bot can be launched from the code published on its own page (or from GitHub). Each bot then only needs a launch script that passes it the necessary parameters, e.g. the page to translate for TranslatorBot. All of this should be runnable from a "dashboard" page on Wikipast or from a simple web application.
The scripts
Each bot has its own script that takes care of the following tasks:
- Parse the command-line parameters
- Get the bot code (from Wikipast or GitHub) (TODO)
- Launch the bot with the given parameters
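The first and last of these tasks can be sketched with `argparse` from the standard library. This is only an illustration under stated assumptions: the `--target-lang` option, the `build_parser` and `run` helpers and the stand-in `fake_bot` are hypothetical, and fetching the bot code is left out because the page marks it as TODO.

```python
import argparse


def build_parser():
    """Command-line interface for a hypothetical TranslatorBot launch script."""
    parser = argparse.ArgumentParser(description="Launch TranslatorBot on wikipast pages")
    parser.add_argument("pages", nargs="+", help="names of the pages to translate")
    # Assumption: the target language is exposed as an option.
    parser.add_argument("--target-lang", default="en", help="target language code")
    return parser


def run(argv, bot):
    """Parse the parameters and launch the bot with them."""
    args = build_parser().parse_args(argv)
    return bot(*args.pages, target_lang=args.target_lang)


def fake_bot(*names, target_lang):
    # Stand-in for the real bot so the sketch runs without network access.
    return [f"({target_lang}) {name}" for name in names]


print(run(["Lausanne", "David Bowie"], fake_bot))
```

A script written with `argparse` in this way takes its parameters as named command-line arguments, which are the values a dashboard form would have to supply.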
The graphical interface
I used the Wooey [1] GUI, which can launch Python scripts. I tested the whole setup with TranslatorBot on the Lausanne and David Bowie pages.
[Figure: Wooey home.png]
[Figure: Wooey done.png]
Code
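One step of the listing that follows, cutting the page text into chunks of about 5000 characters that end on a full stop before sending them to the translator, can be isolated as a small pure function. The sketch below is not the bot's own code: the function name `split_at_full_stops` and its `max_len` parameter are hypothetical, and it uses `str.rfind` instead of the manual index search in the listing.

```python
def split_at_full_stops(text, max_len=5000):
    """Split text into chunks of at most max_len characters.

    Each chunk ends just after the last full stop that fits in the window;
    if the window contains no full stop, the chunk is cut at max_len.
    """
    chunks = []
    start = 0
    while len(text) - start > max_len:
        cut = text.rfind(".", start, start + max_len)
        end = cut + 1 if cut != -1 else start + max_len
        chunks.append(text[start:end])
        start = end
    if start < len(text):
        chunks.append(text[start:])  # remaining text, shorter than max_len
    return chunks


sample = "One. Two. Three. Four."
for chunk in split_at_full_stops(sample, max_len=10):
    print(repr(chunk))
```

Concatenating the chunks gives back the original text, so each chunk can be translated separately and the results joined in order.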
<Nowiki>
import requests
import re
from bs4 import BeautifulSoup
from googletrans import Translator

# the function takes as arguments the names (strings) of the pages to translate
def translate(*names):
    user = 'testbot'
    passw = 'dhbot2017'
    baseurl = 'http://wikipast.epfl.ch/wikipast/'
    summary = 'Wikipastbot update'
    translator = Translator()

    # this parameter is the target language in which we want to translate
    target_lang = 'en'
    target_language = 'English'

    # login request
    payload = {'action': 'query', 'format': 'json', 'utf8': '', 'meta': 'tokens', 'type': 'login'}
    r1 = requests.post(baseurl + 'api.php', data=payload)

    # login confirm
    login_token = r1.json()['query']['tokens']['logintoken']
    payload = {'action': 'login', 'format': 'json', 'utf8': '', 'lgname': user, 'lgpassword': passw, 'lgtoken': login_token}
    r2 = requests.post(baseurl + 'api.php', data=payload, cookies=r1.cookies)

    # get edit token
    params3 = '?format=json&action=query&meta=tokens&continue='
    r3 = requests.get(baseurl + 'api.php' + params3, cookies=r2.cookies)
    edit_token = r3.json()['query']['tokens']['csrftoken']

    edit_cookie = r2.cookies.copy()
    edit_cookie.update(r3.cookies)

    # we fetch the text we want to translate
    for name in names:
        result = requests.post(baseurl + 'api.php?action=query&titles=' + name + '&export&exportnowrap')
        soup = BeautifulSoup(result.text, "lxml")
        code = ''
        for primitive in soup.findAll("text"):
            code += primitive.string

        # create names with english prefix
        en_name = "(" + target_lang + ")_" + translator.translate(name, src='fr', dest=target_lang).text

        # add a table if it still does not exist
        if code != '' and code[0] != '{' and code[0] != '|':
            code2 = '{| class="wikitable"\n| Language\n| ' + "'''Français'''\n| " + target_language + "\n|}\n" + code
            payload2 = {'action': 'edit', 'assert': 'user', 'format': 'json', 'utf8': '', 'text': code2, 'summary': summary, 'title': name, 'token': edit_token}
            r5 = requests.post(baseurl + 'api.php', data=payload2, cookies=edit_cookie)

        # save the links of sources that we will not translate
        sources = []
        i = 0
        while i < len(code) - 1:
            # a source link is a single '[' that is not part of a '[[' wiki link
            if (i == 0 or code[i-1] != '[') and code[i] == '[' and code[i+1] != '[':
                j = i + 2
                while j < len(code) - 1 and code[j] != ']':
                    j += 1
                sources.append(code[i:j+1])
                code = code.replace(code[i:j+1], "[k:last]", 1)
                i = j + 1
            else:
                i += 1
        # translate the whole text in chunks of approx. 5000 characters
        length = len(code)
        string = ''
        punto = '.'
        k = 0
        decrease = 1
        last = k + 5000
        while last < length:
            if code[k+5000-decrease] == punto:
                # the chunk already ends on a full stop
                string += translator.translate(code[k:k+5000-decrease+1], src='fr', dest=target_lang).text
                k = last
            else:
                # move the cut back to the previous full stop
                while decrease < 5000 and code[k+5000-decrease] != punto:
                    decrease += 1
                string += translator.translate(code[k:k+5000-decrease+1], src='fr', dest=target_lang).text
                k = k + 5000 - decrease + 1
                decrease = 1
            last = k + 5000
        # translate the remaining text
        string += translator.translate(code[k:length], src='fr', dest=target_lang).text
        ' and translated_text[i+1] if (translated_text [i+2] == '[j] ==' ['and translated_text [i+2:j]. isalpha ()): j = i while (translated_text [i]! = ']'): j + = 1 m = translated_text [i]