Difference between revisions of "Help:DFM bots"
Edit by EricThrift: Added information about the update script on SharePoint, which will automate most of the tasks described here.
Revision as of 13:30, 25 May 2022
We have been using a variety of "bot" scripts that interact with MediaWiki and other platforms.
The bots, which are generally triggered by a user command, facilitate repetitive tasks such as copying information from one system to another -- for instance, using the data from our Zotero library to generate a bibliography page in the wiki, or retrieving a page from the wiki and publishing it to the public-facing website.
Generally speaking, any task that is undertaken by a bot can also be performed manually. A bot can be viewed as an assistant to wiki editors, rather than an autonomous agent.
Installation
Most of these scripts are version-controlled at https://github.com/DriedFishMatters/pywikibot-scripts. To use them:
- Follow the instructions on the MediaWiki site to install Pywikibot on your computer. On a Windows machine, you can use either the built-in Command Prompt or the Windows Subsystem for Linux (WSL); the Linux command prompt is preferable because it makes installing dependencies much easier. NOTE: When setting this up, the URL for the DFM Wiki will be https://driedfishmatters.org/kb.
- Configure a user account for the DFM Wiki on Pywikibot, again following the instructions on MediaWiki.
- Check out or download the DFM scripts from GitHub to a myscripts directory inside pywikibot2/scripts/userscripts.
- From the myscripts directory, install any dependencies by running the command: pip install -r requirements.txt
- Add the following line to your user-config.py:
user_script_paths = ['scripts.userscripts.myscripts']
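One way to confirm that the last installation step took effect is to read user-config.py and check the value of user_script_paths. The following is a minimal sketch, not part of the DFM scripts: the function name script_paths_from_config is hypothetical, and the sample configuration (family name 'dfm', account 'ExampleBot') is a placeholder rather than the real DFM setup.

```python
import ast


def script_paths_from_config(config_source):
    """Return the list assigned to user_script_paths in user-config.py source text."""
    paths = []
    for node in ast.parse(config_source).body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "user_script_paths":
                # literal_eval only accepts plain literals, so no config code is executed
                paths = ast.literal_eval(node.value)
    return paths


# Hypothetical user-config.py contents; 'dfm' and 'ExampleBot' are placeholders.
example_config = """
family = 'dfm'
mylang = 'en'
usernames['dfm']['en'] = 'ExampleBot'
user_script_paths = ['scripts.userscripts.myscripts']
"""

paths = script_paths_from_config(example_config)
print("scripts.userscripts.myscripts" in paths)
```

To check a real installation, pass the text of your own user-config.py (for example, read with open(...).read()) instead of example_config.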
Usage
Bots can be run from the Pywikibot installation directory by issuing a command in the format:
python pwb.py [bot_name] [options]...
Arguments and options for each bot are listed in the README file for the git repository.
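If you prefer to launch a bot from another Python program, the command format above can be assembled as an argument list. This is only a sketch: build_command and run_bot are hypothetical helper names, and -simulate is Pywikibot's global option for disabling writes to the server, which a given DFM script may or may not honour, so check the README before relying on it.

```python
import subprocess
import sys


def build_command(bot_name, options=()):
    """Build the argument list for: python pwb.py [bot_name] [options]..."""
    return [sys.executable, "pwb.py", bot_name, *options]


def run_bot(bot_name, options=(), pywikibot_dir="."):
    """Run one bot from the Pywikibot installation directory and return its exit code."""
    completed = subprocess.run(build_command(bot_name, options), cwd=pywikibot_dir)
    return completed.returncode


# Show the command without executing it.
print(" ".join(build_command("ical2wiki", ["-simulate"])[1:]))
```

Calling run_bot("ical2wiki", ["-simulate"], pywikibot_dir="path/to/pywikibot") would execute the command; the directory shown is a placeholder.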
List of scripts
Script | Description |
---|---|
blog2wiki | Retrieves the RSS feed of recent blog posts and uploads them to the wiki. Used in generating the DFM Newsletter. |
compile_tables | Parses information from tables in pages in a given wiki category, where each table has two columns representing keys and values, and generates a new page with a table combining the data from all tables. For example, this was used in automatically generating the E-book abstracts summary table out of the pages in Category:E-book synopsis. |
docx2wiki | Uploads the content of a Word document to the wiki. Reads an input Word document (docx); uploads new images from the document to the wiki, using an image fingerprint hash algorithm to compare images in the document to those already on the wiki and detect visually similar images (e.g., cropped or resized versions of the same image); converts Zotero citations to wiki templates; then uploads the document text to the wiki. This bot is used in converting Word manuscripts to documents and image sets managed on the DFM wiki, notably (but not exclusively) the Scoping reports. Some manual cleanup of the wiki text is generally required. Metadata for newly imported images will also need to be completed. Note that some images of stacks of dried fish are very similar, so they may be detected as duplicates even if they are slightly different. |
featured_pages | Locates the most recently added items in a featured category, and posts a page to the wiki listing those pages with their summaries. Pages are linked using the canonical URL, so the generated wiki page can be redistributed outside the wiki. This is used for the DFM Newsletter. |
ical2wiki | Reads an iCalendar file, accessible from a public URL, and updates a wiki page containing a listing of upcoming events from the calendar file. Used on the Calendar page. |
report | Downloads a page from the wiki along with high-resolution versions of all the images embedded in that page, retrieves data for any citations on that page that are linked to the Zotero group library, then saves the page as an HTML document containing front matter, standalone content, notes and bibliography, and table of contents. The resulting document can be used as input for generation of any of the output formats supported by Calibre, including EPUB and PDF. See examples on DFM Working Papers. |
trello2wiki | Reads data from a Trello board and constructs a wiki page for each label on the board, listing the cards sharing that label and the status of each (taken from the card comments). |
trelloattachments | Downloads all the attachments from cards on a Trello board to the local computer, ignoring those that have previously been downloaded. Useful for backups or for redistributing Trello-managed attachments (like e-book chapter revisions) to people who are not using Trello. |
wiki2html | Creates a series of static HTML pages, plus an index with brief extracts, for wiki pages in a given category (in our case, Category:Public). These can then be synchronized with the web server using rsync. For example, see https://driedfishmatters.org/pub/. Images are downloaded and links are adjusted so that the content can be distributed independently of the wiki, without links to internal content or editing functions. |
zotero_bibliography | Downloads reference details from a collection in a Zotero library and generates a complete bibliographic listing of items in that collection, sorted by item type, with abstracts and links to downloadable attachments. This is used on the page Publications, which is exported to https://driedfishmatters.org/pub/publications.html. |
zotero_recently_added | Uploads a list of recent items from a Zotero collection to a wiki page. This is used to generate the list of items on the DFM Newsletter page, for example. |
zotero2wiki | Retrieves a list of items in a Zotero collection and uploads the details to a page on the wiki using Template:Report. This is used to generate DFM Working papers, for example. Note that the "Cover" metadata field must be placed in the "Extra" field as there is no built-in corresponding property in Zotero. |
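To make the compile_tables description more concrete, the sketch below shows one way that key/value tables from several pages could be combined into a single summary table and rendered as wikitext. It is an illustration under assumptions, not the actual script: the function names, the "Page" column, and the sample data are hypothetical, and fetching and parsing the wiki pages is left out.

```python
def combine_tables(pages):
    """Merge per-page key/value tables into one header row plus one row per page.

    pages maps a page title to a dict of the keys and values found in its table.
    """
    columns = []
    for fields in pages.values():
        for key in fields:
            if key not in columns:
                columns.append(key)  # keep keys in first-seen order
    header = ["Page"] + columns
    # Pages that lack a key get an empty cell in that column
    rows = [[title] + [fields.get(col, "") for col in columns]
            for title, fields in pages.items()]
    return header, rows


def to_wikitext(header, rows):
    """Render the combined table using standard MediaWiki table syntax."""
    lines = ['{| class="wikitable"', "! " + " !! ".join(header)]
    for row in rows:
        lines.append("|-")
        lines.append("| " + " || ".join(row))
    lines.append("|}")
    return "\n".join(lines)


# Hypothetical data standing in for tables parsed from two wiki pages.
sample = {
    "Chapter A": {"Author": "Author One", "Country": "India"},
    "Chapter B": {"Author": "Author Two", "Status": "Draft"},
}
header, rows = combine_tables(sample)
print(to_wikitext(header, rows))
```

Keeping the keys in first-seen order means the combined table follows the layout of the first page and appends any extra keys found later.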
Update script
See the UPDATE.sh script on SharePoint.
This script will:
- Make copies of all tasks from Trello (including archived tasks) to the wiki under Category:MEL
- Download copies of Trello attachments to a local folder (directory must be configured in the script)
- Upload events from the shared calendar to Calendar
- Upload a list of recent Zotero items to Recent research outputs
- Update the page Publications
- Update the page Featured pages
- Create a list of recent blog entries on the page Blog feed
- Export files in Category:Public to the DFM public website
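The contents of UPDATE.sh are not reproduced here, but the sequence above could be sketched in Python as an ordered run of bots that stops at the first failure. Everything in this sketch is an assumption: the mapping from each step to a script name is inferred from the script descriptions, the per-bot options (documented in the README) are omitted, and run_update is a hypothetical helper rather than part of the repository.

```python
import subprocess
import sys

# Assumed mapping of the update steps to bot names, in the order listed above.
UPDATE_STEPS = [
    "trello2wiki",
    "trelloattachments",
    "ical2wiki",
    "zotero_recently_added",
    "zotero_bibliography",
    "featured_pages",
    "blog2wiki",
    "wiki2html",
]


def run_update(steps=UPDATE_STEPS, runner=subprocess.run, options=None):
    """Run each bot in order; raise on the first non-zero exit code."""
    options = options or {}
    finished = []
    for bot in steps:
        command = [sys.executable, "pwb.py", bot, *options.get(bot, [])]
        result = runner(command)
        if result.returncode != 0:
            raise RuntimeError(f"{bot} failed with exit code {result.returncode}")
        finished.append(bot)
    return finished


def dry_run(command):
    """Print the command instead of executing it, and report success."""
    print(" ".join(command[1:]))
    return subprocess.CompletedProcess(command, 0)


# Preview the planned commands without contacting the wiki.
run_update(runner=dry_run)
```

Replacing runner=dry_run with the default would execute the bots for real, so that should only be done from the Pywikibot installation directory with the required options supplied.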