Reiter

Creating workflows is part of the practical work of digital humanities. In doing so, you must not only observe best practices from computer science and data science, but also take into account the differences in the working methods of the two cultures. In this unit, you will learn the basics of Orange, a tool in which you can visualise and execute parts of your workflow.

Orange - Introduction and installation

One tool that can support you in designing and documenting the technical part of your workflow is Orange. It is user-friendly open-source software for data mining and machine learning. It offers basic functions for:
Data import: Orange allows you to import data from various sources such as CSV files, Excel, databases, etc.
Data visualisation: Once imported, users can visualise the data to identify patterns, trends and outliers. Orange offers a variety of chart types and interactive visualisation tools.
Data processing: Before analysis, data can be pre-processed using various methods, e.g. by removing missing values, normalisation or transformation.
Modelling: Orange offers a wide range of algorithms for modelling, including decision trees, k-nearest neighbours, support vector machines and more. Users can easily apply and adapt these algorithms to their data.
Evaluation: Once models have been created, they can be evaluated using various metrics such as accuracy, precision and recall. Orange provides
When you install Orange (see below), Python and Miniconda are installed at the same time. Miniconda is a minimal version of Anaconda, a software distribution that manages Python installations on your computer. Miniconda is needed for Orange because you can integrate and execute Python code directly in Orange.
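To make the steps listed above more concrete, here is a minimal sketch in plain Python of a tiny import, processing, modelling and evaluation chain. It does not use Orange's own API. The toy data, the column names and all function names are assumptions made up for this illustration, and the model is a simple 1-nearest-neighbour classifier evaluated by accuracy only.

```python
import csv
import io
import math

# Hypothetical toy data standing in for a CSV file; the column names are made up.
RAW_CSV = """length,width,label
1.0,2.0,a
1.2,1.8,a
,2.1,a
5.0,6.0,b
5.5,6.2,b
4.8,5.9,b
"""

def load_rows(text):
    """Data import: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def drop_missing(rows):
    """Data processing: remove rows that contain an empty value."""
    return [r for r in rows if all(v != "" for v in r.values())]

def normalise(rows, columns):
    """Data processing: min-max scale the given numeric columns to [0, 1]."""
    out = [dict(r) for r in rows]
    for col in columns:
        values = [float(r[col]) for r in rows]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        for r_out, v in zip(out, values):
            r_out[col] = (v - lo) / span
    return out

def predict_1nn(train, point, columns):
    """Modelling: return the label of the closest training row (1-nearest neighbour)."""
    def dist(r):
        return math.dist([r[c] for c in columns], [point[c] for c in columns])
    return min(train, key=dist)["label"]

def leave_one_out_accuracy(rows, columns):
    """Evaluation: share of rows predicted correctly when each is held out once."""
    hits = 0
    for i, row in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        hits += predict_1nn(train, row, columns) == row["label"]
    return hits / len(rows)

features = ["length", "width"]
rows = normalise(drop_missing(load_rows(RAW_CSV)), features)
accuracy = leave_one_out_accuracy(rows, features)
print(len(rows), accuracy)  # prints: 5 1.0
```

In Orange, each of these functions would roughly correspond to one widget on the canvas, connected in the same order.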
Windows:
  1. Go to the official Orange website: https://orange.biolab.si/download/
  2. Click on the "Download Orange" button for Windows.
  3. After downloading the installation file, run it.
  4. Follow the instructions of the installation wizard to install Orange on your system.
  5. After installation, you can start Orange from the start menu or the programme list.
Mac:
  1. Go to the official Orange website: https://orange.biolab.si/download/
  2. Click on the "Download Orange" button for Mac.
  3. After downloading the .dmg file, open it.
  4. Drag the Orange icon into the Applications folder.
  5. Find Orange in the Applications folder and launch it from there.
Linux:
  1. Open the terminal.
  2. Update the package list: sudo apt-get update
  3. Install system packages:
       sudo apt-get install git python-pip python-virtualenv python3-dev python3-numpy python3-scipy python3-pyqt4 python-qt4-dev python3-sip-dev libqt4-dev
  4. Make a new Orange environment:
       mkdir orange3env
       virtualenv -p python3 --system-site-packages orange3env
       source orange3env/bin/activate
  5. Install Orange (keep in mind this will take around 15 minutes):
       git clone https://github.com/biolab/orange3
       cd orange3
       pip install -r requirements.txt
       pip install -r requirements-gui.txt
       pip install -r requirements-sql.txt
       python setup.py develop
       cd ..
       git clone https://github.com/biolab/orange-bio
       cd orange-bio
       python setup.py develop
  6. Open Orange with: python -m Orange.canvas
Orange - Basic functions
The following window opens when you start Orange.
You can use the three green buttons to create a new project in Orange, open an Orange project file or open a recently edited project. The orange buttons offer you different ways of getting to know the programme better. If you are interested in Orange, you are welcome to try out all these options. First click on "New" to familiarise yourself with the basic functions in this unit.
The Orange user interface consists of the menu bar, the widget catalogue and some options at the bottom left of the screen with which you can clearly design and annotate your workflow. The help icon (?) is already selected here by default to display supporting descriptions. You can expand and collapse the catalogue using the two arrows at the top right-hand edge of the widget catalogue.
Annotating an Orange workflow
Today we will look at a simple workflow with which you can use the texts you already know from stylometry for topic modelling. To illustrate once again what a difference stopwords make for the automatic recognition of topics, I have prepared a workflow for you that works with the Text Mining add-on for Orange. You can install this by selecting Options >> Add-ons in the menu bar.
In the installer window for add-ons, scroll down until you find the add-on "Text". Tick the box for the add-on and then click OK to install it. Now you have to restart Orange. After the restart you should see the new category Text in your widget catalogue. Now you can import workflows that use widgets from the Text add-on.
You import this workflow either by selecting the "Open" option in the start window or, if you have already closed your start window, via the menu bar File >> Open.
Once you have loaded the file, your canvas should look like this:
Double-click on the individual widgets - for example "Preprocess Text" - to see what they do. You can also select other options for stopwords in Preprocess Text or change the number of topics in Topic Modelling. Then use the options at the bottom left of the screen to annotate the workflow.
Click on the i-icon to insert a general description for the entire workflow. Use the #-icon to arrange your widgets in grid-like intervals. Use the T-icon to create text boxes for annotating the workflow, which you can place anywhere in the canvas. With the arrow icon, you can create arrows on your canvas with which you can point from text boxes to the widget to which the text refers. Use the pause icon to freeze your workflow - i.e. pause all calculations that are currently being performed. With the ?-icon you can switch the help texts that are displayed by default for the currently selected element on and off.
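The effect of the stopword options in Preprocess Text can also be illustrated outside Orange. The following is a minimal sketch in plain Python, not the method the Text add-on uses internally. The example sentence, the stopword list and the function names are assumptions made up for illustration. It counts the most frequent words once with and once without stopword filtering.

```python
from collections import Counter
import re

# Hypothetical mini corpus and stopword list, made up for illustration only.
TEXT = "The ship and the sea. The captain of the ship and the crew of the ship."
STOPWORDS = {"the", "and", "of", "a", "in"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def top_words(tokens, n=3, stopwords=frozenset()):
    """Return the n most frequent tokens, ignoring any stopwords."""
    kept = [t for t in tokens if t not in stopwords]
    return Counter(kept).most_common(n)

tokens = tokenize(TEXT)
with_stopwords = top_words(tokens)
without_stopwords = top_words(tokens, stopwords=STOPWORDS)
print(with_stopwords)     # the most frequent word is "the"
print(without_stopwords)  # the most frequent word is "ship"
```

Without filtering, the function word "the" leads the list. With filtering, the content word "ship" moves to the top, which is the kind of difference you can observe in the topics when you change the stopword settings in the workflow.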
Task:
Annotate workflow: Add an annotation for each widget in the example workflow by describing what you think the widget does. You don't need to go into too much detail. Then take a screenshot and upload it.