This is the main repository for the Catavento software. The following folders contain the projects used:
Files for the field research (design fiction, datathon, voice recording) are stored in Box in the subfolders of:
https://ibm.ent.box.com/folder/50333931736
Additional files for the UI design are stored in Box at the following folder:
https://ibm.ent.box.com/folder/52753086573
The project uses the following external dependencies:
Node.js (for the Controller server)
Python 2.7 (for the TOC service) and Python 3.5 (for the Microphone, ScreenMirroring and LearningServer services and the Desktop App)
MySQL and Redis
For Processing, the following libraries: Minim, TimedEvents, 'HTTP Requests for Processing' and GifAnimation. Install these libraries using the Sketch->Import Library...->Add Library menu in the Processing IDE. The GifAnimation lib requires manually copying the files into the libraries folder (Windows: C:\Users...\Documents\Processing\libraries, MacOS: /Users/ibm/Documents/Processing)
Processing also uses an external lib called 'Processing HTTP Server', available at https://diskordier.net/simpleHTTPServer/. Just create a /code folder in the sketch and copy the two .jar files (freemarker.jar and SimpleHTTPServer.jar) into it. Also create a /data folder and add a simple index.html file
For the generation of the audio files: SOX (http://sox.sourceforge.net/) and Lame (http://lame.sourceforge.net/download.php). Make sure to install the binaries (sox and lame) in /usr/local/bin
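As a minimal sketch of how these binaries can be driven from Python (the file names and sample rate are illustrative, not the project's actual values):

import subprocess

# Downsample/downmix a recording with sox, then encode it to MP3 with lame
def wav_to_mp3(wav_in, mp3_out):
    subprocess.run(["/usr/local/bin/sox", wav_in, "-r", "22050", "-c", "1", "tmp.wav"], check=True)
    subprocess.run(["/usr/local/bin/lame", "tmp.wav", mp3_out], check=True)

wav_to_mp3("voice.wav", "voice.mp3")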
For the Volume control in MacOS: osascript and SwitchAudioSource (https://github.com/deweller/switchaudio-osx)
For the Volume control in Windows: SetVol (https://rlatour.com/setvol/)
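A hedged cross-platform sketch using those tools (set_volume is a hypothetical helper, not project code; the flags shown are the documented ones):

import subprocess, sys

def set_volume(percent):
    if sys.platform == "darwin":
        # macOS: osascript drives the system output volume (0-100)
        subprocess.run(["osascript", "-e", "set volume output volume %d" % percent], check=True)
    else:
        # Windows: SetVol takes the target level directly
        subprocess.run(["SetVol.exe", str(percent)], check=True)

set_volume(50)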
For the Raspberry Pi 3/4 we use the Raspbian distribution, NodeJS and Python. We also use the BerryLan (http://www.berrylan.org/) app + service to set up the network connection without a mouse/keyboard/monitor
To turn the TVs on and off we use a Linux package called cec-utils (sudo apt install cec-utils) on the Raspberry Pis
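For example, the TVs can be driven from Python through cec-client (part of cec-utils); "on 0" powers on the TV at logical address 0 and "standby 0" turns it off:

import subprocess

def tv_power(on):
    cmd = b"on 0" if on else b"standby 0"
    # -s: single command read from stdin, -d 1: reduce log verbosity
    subprocess.run(["cec-client", "-s", "-d", "1"], input=cmd, check=True)

tv_power(True)  # turn the TV on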
The Controller app has a package.json file with the dependencies. To install them, run npm install in the Controller folder
The Microphone, ScreenMirroring and LearningServer services and the Kiosk App have a requirements.txt file with the Python dependencies. To install, use "python3 -m pip install -r requirements.txt" in the corresponding folder (e.g., the LearningServer folder).
The system uses 3 external APIs of the IBM Cloud: Watson Assistant (WA), Speech-to-Text (STT) and Text-to-Speech (TTS). Check the corresponding sections in this README for more details.
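For illustration only, a stateless WA call with the ibm-watson Python SDK (the project itself talks to WA through the Node.js SDK; the API key, URL and assistant id below are placeholders):

from ibm_watson import AssistantV2
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

assistant = AssistantV2(version="2021-06-14", authenticator=IAMAuthenticator("YOUR_APIKEY"))
assistant.set_service_url("https://api.us-south.assistant.watson.cloud.ibm.com")

# Send one user utterance and print the bot's reply payload
response = assistant.message_stateless(
    assistant_id="YOUR_ASSISTANT_ID",
    input={"message_type": "text", "text": "Olá"},
).get_result()
print(response["output"])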
The script that starts the Controller: iniciaController_Catavento.sh
The script that starts all the Ravel services + Microphone: /RavelCatavento/iniciaRavelCatavento.sh
The script that stops all the Ravel services + Microphone: /RavelCatavento/ParaRavelCatavento.sh
Ravel Hub: 4000
Wrapper bot-a: 9060
Wrapper bot-b: 9061
Wrapper bot-c: 9062
TOC (Topic Classifier): 5000
Ravel States: 7000
ControlerCatavento: 80
Processing: 8000
Microphone Python: 6060
ScreenMirroring: 10000
LearningServer: 9090
MySQL: 3306
Redis: 6379
RobotControl: 7070
VoiceServer: 5555
Kiosk App: 1000
The Kiosk Desktop App is a Python 3 program that runs a browser in fullscreen. This is made possible by the PyWebView dependency (https://pywebview.flowrl.com/)
Change the kiosk_app.py file to set up the IP address of the Controller
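A minimal sketch of what kiosk_app.py does (the IP below is a placeholder for the Controller address you configure):

import webview

CONTROLLER_IP = "192.168.0.10"  # placeholder: the Controller machine's address
webview.create_window("Catavento", "http://" + CONTROLLER_IP, fullscreen=True)
webview.start()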
To exit the Kiosk Desktop App with the keyboard: -> In Ubuntu: just send ALT+F4 -> In Windows: hold ALT+TAB, choose the browser window and click the 'X' in the upper right corner with the mouse
Monitor 3=BotA
Monitor 2=BotB
Monitor 1=BotC
If a bot is not on the right monitor, adjust the "--display" parameter in lines 136-145 of Processingcatavento.pde in Processing
An additional USB hub should be plugged in with 3 audio adapters (dongles) + 1 USB microphone. The audio adapters have the following names:
-> BotA: Logicool G430 Gaming Headset
-> BotB: USB PnP Sound Device
-> BotC: USB PnP Sound Device
The default computer audio output ("Built-in Output") should be connected with an external speaker.
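On MacOS, playback can be routed to one of these adapters with the SwitchAudioSource tool mentioned above, e.g. (a sketch assuming its documented -t/-s flags):

import subprocess

def use_output(device_name):
    # -t output selects the device type, -s picks the device by its exact name
    subprocess.run(["SwitchAudioSource", "-t", "output", "-s", device_name], check=True)

use_output("Logicool G430 Gaming Headset")  # BotA's adapter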
Ravel uses ElasticSearch and the Microphone microservice uses MySQL. Check the /SQL folder for scripts to create the MySQL database and tables. For ElasticSearch, check the /doc folder for scripts on how to configure the ElasticSearch index.
Ravel also uses Redis to store keys. Check /RavelCatavento/redis-cli for more info and a script (script_atualiza_redis.sh) that inserts the necessary keys
A MySQL database is used by the LearningServer for storing elements from all bot corpora stored in Watson Assistant, and the paths to the trained models for each bot. For retrieving bot credentials, the LearningServer reads them from the same Redis database used by Ravel.
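A hypothetical sketch of such a credential lookup (the key name is illustrative; script_atualiza_redis.sh defines the real ones):

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
apikey = r.get("bot-a:apikey")  # placeholder key name
print(apikey)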
5 voices were recorded for the initial prototype. The recording was made using the online demos and the Audacity + Windows WASAPI recording method. Check the /Voices folder. The following online demos for the voices were used:
Isabela_Watson: https://text-to-speech-demo.ng.bluemix.net/
NVDA eSpeak: https://eeejay.github.io/espeak/emscripten/espeak.html
Felipe Nuance: https://www.nuance.com/omni-channel-customer-engagement/voice-and-ivr/text-to-speech.html#! NOTE: The pt-br "Felipe" Nuance voice is also available for MacOS users at no cost. Just install it using this information: https://www.tekrevue.com/tip/make-your-mac-talk-say-command/
Gabriel Cereproc: https://www.cereproc.com/support/live_demo
Maria Microsoft: https://developer.microsoft.com/en-us/microsoft-edge/testdrive/demos/speechsynthesis/
Daniel Microsoft: check the Narrator app installed in the OS
Google Translate Female Voice: install SOX + libmad (Linux: $sudo apt-get install sox libsox-fmt-mp3, Windows: https://www.videohelp.com/software/SoX). Use the GoogleSpeech python lib (https://github.com/desbma/GoogleSpeech). Check /Voices/tts_Google for an example
To generate the audio files using the text-to-speech services via script:
Isabela Watson: check the tts_watson.js file in /Voices/tts_watson
To use the eSpeak voice, first you need to install the eSpeak engine:
$brew install espeak
Then use the espeak binary with the appropriate parameters, e.g.:
$espeak "Esta é uma voz em português" -v brazil -w teste.wav
To use the macOS Felipe voice, use the say binary:
$say "Este é um teste de voz" -v Felipe -o out
Note: the output file format is aiff, which needs to be converted to mp3 using the lame binary
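For example, assuming the say command above produced out.aiff:

import subprocess

# lame reads the AIFF produced by `say` directly and writes an MP3
subprocess.run(["/usr/local/bin/lame", "out.aiff", "out.mp3"], check=True)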
Other Brazilian Portuguese voice alternatives
To use the Microsoft Daniel voice:
a) Enable it by changing the Registry (dan.reg)
b) Test if it is installed:
PowerShell -Command "Add-Type -AssemblyName System.Speech; $speak = New-Object System.Speech.Synthesis.SpeechSynthesizer;$speak.GetInstalledVoices().VoiceInfo; $speak.Dispose(); "
c) Use PowerShell to generate a wav file:
PowerShell -Command "Add-Type -AssemblyName System.Speech; $speak = New-Object System.Speech.Synthesis.SpeechSynthesizer;$speak.SetOutputToWaveFile('test.wav'); $speak.SelectVoice('Microsoft Daniel'); $speak.Speak('Teste de voz'); $speak.Dispose() "
Watson Speech to Text Web Demo: https://speech-to-text-demo.ng.bluemix.net/
Google Web API Speech to Text Web Demo (requires Chrome): https://www.google.com/intl/en/chrome/demos/speech.html
The IBM Cloud STT is used in the Microphone micro-service when the user reads the question, to check whether what is being said is close to what is written on the screen. The STT service used is called 'STT_Catavento' and it is hosted in the IBM Cloud under the Standard plan of the RIS5-BRL account. To access this service use the following URL:
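As a minimal sketch of such a transcription call with the Watson Python SDK (the URL, API key and file name are placeholders, not the real STT_Catavento credentials):

from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

stt = SpeechToTextV1(authenticator=IAMAuthenticator("YOUR_APIKEY"))
stt.set_service_url("https://api.us-south.speech-to-text.watson.cloud.ibm.com")

# Transcribe a recorded question with the Brazilian Portuguese model
with open("question.wav", "rb") as audio:
    result = stt.recognize(audio=audio, content_type="audio/wav",
                           model="pt-BR_BroadbandModel").get_result()
print(result["results"][0]["alternatives"][0]["transcript"])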
The order in which the robots answer (first, second, third) depends on the topic of the answer for a particular question. The topic is stored in the description of the intent in the following format: <topic>|<subtopic>
The following image shows the rules for the order of the robot answers for each topic
If it is a spontaneous question (no training):
Use the table to check the order of the answers
Use the following percentages for the confidences (a code sketch implementing these rules appears after the scenarios below):
First to respond: 80% TO 90% (BASE)
Second to respond: 40% TO 70% (BASE)
Third to respond: 10% TO 30% (BASE)
If the question is asked after training, use the table to check the order of the answers
Scenario 1: Only ONE question has been trained.
If it is the FIRST answer:
First to respond: 91% TO 99% (INCREASED)
Second to respond: 40% TO 70% (EQUAL TO BASE)
Third to respond: 10% TO 30% (EQUAL TO BASE)
If it is the SECOND answer:
First to respond: 80% TO 90% (EQUAL TO BASE)
Second to respond: 71% TO 79% (INCREASED)
Third to respond: 10% TO 30% (EQUAL TO BASE)
If it is the THIRD answer:
First to respond: 80% TO 90% (EQUAL TO BASE)
Second to respond: 40% TO 70% (EQUAL TO BASE)
Third to respond: 10% TO 30% (INCREASED)
Scenario 2: TWO questions have been trained.
If FIRST and SECOND:
First to respond: 91% TO 99% (INCREASED)
Second to respond: 71% TO 79% (INCREASED)
Third to respond: 10% TO 30% (EQUAL TO BASE)
If FIRST and THIRD:
First to respond: 91% TO 99% (INCREASED)
Second to respond: 40% TO 70% (EQUAL TO BASE)
Third to respond: 31% TO 39% (INCREASED)
If SECOND and THIRD:
First to respond: 80% TO 90% (EQUAL TO BASE)
Second to respond: 71% TO 79% (INCREASED)
Third to respond: 31% TO 39% (INCREASED)
Scenario 3: THREE questions have been trained.
First to respond: 91% TO 99% (INCREASED)
Second to respond: 71% TO 79% (INCREASED)
Third to respond: 31% TO 39% (INCREASED)
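A compact sketch of the confidence rules above (the function name and the random draw are illustrative; the real implementation lives in the services):

import random

BASE = {"first": (80, 90), "second": (40, 70), "third": (10, 30)}
INCREASED = {"first": (91, 99), "second": (71, 79), "third": (31, 39)}

def displayed_confidence(order, trained):
    # order: "first" | "second" | "third"
    # trained: set of positions whose answers have been trained
    lo, hi = (INCREASED if order in trained else BASE)[order]
    return random.randint(lo, hi)

# Scenario 2, FIRST and THIRD trained: the second bot keeps its base range
print(displayed_confidence("second", {"first", "third"}))  # between 40 and 70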
We used an Android (7.0+) app called "StartAdroidAPP" to start up and shut down the devices.
Check the diagram "Catavento_ArquiteturaStart_Shutdown.docx" in the /doc/AutomaticStart_Shutdown folder for the physical ethernet cable connection of the devices.
We use Wake-on-LAN on the computers (Windows + MacOS + Raspberry Pi) and an Arduino relay module to start/shut down the devices.
A special entry point in the ScreenMirroring server is responsible for making the equipment sleep (hibernate). The WOL magic packet wakes them up.
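A minimal Wake-on-LAN sender (the MAC address is a placeholder): a magic packet is 6 bytes of 0xFF followed by the target MAC repeated 16 times, broadcast over UDP:

import socket

def wake(mac):
    payload = bytes.fromhex("FF" * 6 + mac.replace(":", "") * 16)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(payload, ("255.255.255.255", 9))

wake("AA:BB:CC:DD:EE:FF")  # placeholder MAC of the target machine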
We used an iPad app called "Kiosk Mode for iPad". Check the following link:
https://itunes.apple.com/us/app/kiosk-mode-for-ipad/id986554705?mt=8
To prevent the user from exiting the browser we used the native iOS kiosk mode called "Guided Access". Check the following link for more information:
https://www.howtogeek.com/177366/how-to-lock-down-your-ipad-or-iphone-for-kids/
To get access to the login page, which is a restricted area in the guided interaction mode, the user should touch right on the welcome title on the index page.
The official Skills workspace for the bots in the Catavento is in the RIS5-BRL account. The name of the service is 'Conversations-Catavento'; it was created in the Standard plan within the 'default' resource group of IBM Cloud and the link to access it is:
NOTE: In the new Watson Assistant Plug (WAP) version (2021) it is necessary to create a linked assistant in order to use it. For the 3 Brazilian bots we created the linked assistants and generated an API key to access them through the API via the ibm-cloud Node.js SDK.
It contains the following bots:
Skill Name: Bot-A-SMART
linked assistant: catavento-bot-a
Skill Name: Bot-B-FUNNY
linked assistant: catavento-bot-b
Skill Name: Bot-C-PATRONIZING
linked assistant: catavento-bot-c
Skill Name: Bot-A-Purple_CSCWDEMO
linked assistant: catavento-bota-CSCWDEMO
Skill Name: BotB-Yellow_CSCWDEMO
linked assistant: catavento-botb-CSCWDEMO
Skill Name: BotC-Green_CSCWDEMO
linked assistant: catavento-botc-CSCWDEMO
Skill Name: Bot_Catavento_vida
Skill Name: Bot_Catavento_vida_versao_multi
The IBM Catavento project is licensed under the Apache 2.0 license. Full license text is available at LICENSE.