This is the main repository for the Catavento software. The following folders contain the projects used:
Files for the field research (design fiction, datathon, voice recording) are stored in Box in the subfolders of:
https://ibm.ent.box.com/folder/50333931736
Additional files for the UI design are stored in Box at the following folder:
https://ibm.ent.box.com/folder/52753086573
The project uses the following external dependencies:
Node.js (for the Controller server)
Python 2.7 (for the TOC service) and Python 3.5 (for the Microphone, ScreenMirroring and LearningServer services and the Desktop App)
MySQL and Redis
For Processing, the following libraries: Minim, TimedEvents, 'HTTP Requests for Processing' and GifAnimation. Install these libraries using the Sketch->Import Library...->Add Library menu in the Processing IDE. The GifAnimation lib requires manually copying the files into the libraries folder (Windows: C:\Users...\Documents\Processing\libraries, MacOS: /Users/ibm/Documents/Processing)
Processing also uses an external lib called 'Processing HTTP Server', available at https://diskordier.net/simpleHTTPServer/. Just create a /code folder in the sketch and copy the two .jar files (freemarker.jar and SimpleHTTPServer.jar) into it. Also create a /data folder and add a simple index.html file
For the generation of the audio files: SOX (http://sox.sourceforge.net/) and Lame (http://lame.sourceforge.net/download.php). Make sure to install the binaries (sox and lame) in /usr/local/bin
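As a minimal sketch of how these binaries can be driven from Python (the file names and sample rate are illustrative, not the project's actual values):

import subprocess

# Downsample/downmix a recording with sox, then encode it to MP3 with lame
def wav_to_mp3(wav_in, mp3_out):
    subprocess.run(["/usr/local/bin/sox", wav_in, "-r", "22050", "-c", "1", "tmp.wav"], check=True)
    subprocess.run(["/usr/local/bin/lame", "tmp.wav", mp3_out], check=True)

wav_to_mp3("voice.wav", "voice.mp3")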
For the Volume control in MacOS: osascript and SwitchAudioSource (https://github.com/deweller/switchaudio-osx)
For the Volume control in Windows: SetVol (https://rlatour.com/setvol/)
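A hedged cross-platform sketch using those tools (set_volume is a hypothetical helper, not project code; the flags shown are the documented ones):

import subprocess, sys

def set_volume(percent):
    if sys.platform == "darwin":
        # macOS: osascript drives the system output volume (0-100)
        subprocess.run(["osascript", "-e", "set volume output volume %d" % percent], check=True)
    else:
        # Windows: SetVol takes the target level directly
        subprocess.run(["SetVol.exe", str(percent)], check=True)

set_volume(50)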
For the Raspberry Pi 3/4 we use the Raspbian distribution, NodeJS and Python. We also use the BerryLan (http://www.berrylan.org/) app + service to set up the network connection without a mouse/keyboard/monitor
To turn the TVs on and off we use a Linux package called cec-utils (sudo apt install cec-utils) on the Raspberry Pis
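For example, the TVs can be driven from Python through cec-client (part of cec-utils); "on 0" powers on the TV at logical address 0 and "standby 0" turns it off:

import subprocess

def tv_power(on):
    cmd = b"on 0" if on else b"standby 0"
    # -s: single command read from stdin, -d 1: reduce log verbosity
    subprocess.run(["cec-client", "-s", "-d", "1"], input=cmd, check=True)

tv_power(True)  # turn the TV on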
The Controller app has a package.json file with the dependencies. To install them, run npm install in the Controller folder
The Microphone, ScreenMirroring and LearningServer services and the Kiosk App have a requirements.txt file with the Python dependencies. To install, use "python3 -m pip install -r requirements.txt" in the corresponding folder (e.g., the LearningServer folder).
The system uses 3 external APIs of the IBM Cloud: Watson Assistant (WA), Speech-to-Text (STT) and Text-to-Speech (TTS). Check the corresponding sections in this README for more details.
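For illustration only, a stateless WA call with the ibm-watson Python SDK (the project itself talks to WA through the Node.js SDK; the API key, URL and assistant id below are placeholders):

from ibm_watson import AssistantV2
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

assistant = AssistantV2(version="2021-06-14", authenticator=IAMAuthenticator("YOUR_APIKEY"))
assistant.set_service_url("https://api.us-south.assistant.watson.cloud.ibm.com")

# Send one user utterance and print the bot's reply payload
response = assistant.message_stateless(
    assistant_id="YOUR_ASSISTANT_ID",
    input={"message_type": "text", "text": "Olá"},
).get_result()
print(response["output"])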
The script that starts the Controller: iniciaController_Catavento.sh
The script that starts all the Ravel services + Microphone: /RavelCatavento/iniciaRavelCatavento.sh
The script that stops all the Ravel services + Microphone: /RavelCatavento/ParaRavelCatavento.sh
Ravel Hub: 4000
Wrapper bot-a: 9060
Wrapper bot-b: 9061
Wrapper bot-c: 9062
TOC (Topic Classifier): 5000
Ravel States: 7000
ControlerCatavento: 80
Processing: 8000
Microphone Python: 6060
ScreenMirroring: 10000
LearningServer: 9090
MySQL: 3306
Redis: 6379
RobotControl: 7070
VoiceServer: 5555
Kiosk App: 1000
The Kiosk Desktop App is a Python 3 program that runs a browser in fullscreen. This is made possible by the PyWebView dependency (https://pywebview.flowrl.com/)
Change the kiosk_app.py file to set up the IP address of the Controller
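A minimal sketch of what kiosk_app.py does (the IP below is a placeholder for the Controller address you configure):

import webview

CONTROLLER_IP = "192.168.0.10"  # placeholder: the Controller machine's address
webview.create_window("Catavento", "http://" + CONTROLLER_IP, fullscreen=True)
webview.start()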
To exit the Kiosk Desktop App with the keyboard: -> In Ubuntu: just send ALT+F4 -> In Windows: hold ALT+TAB, choose the browser window and click the 'X' in the upper right corner with the mouse
Monitor 3=BotA
Monitor 2=BotB
Monitor 1=BotC
If a bot is not on the right monitor, adjust the "--display" parameter in lines 136-145 of Processingcatavento.pde in Processing
An additional USB hub should be plugged in with 3 audio adapters (dongles) + 1 USB microphone. The audio adapters have the following names:
-> BotA: Logicool G430 Gaming Headset
-> BotB: USB PnP Sound Device
-> BotC: USB PnP Sound Device
The default computer audio output ("Built-in Output") should be connected with an external speaker.
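On MacOS, playback can be routed to one of these adapters with the SwitchAudioSource tool mentioned above, e.g. (a sketch assuming its documented -t/-s flags):

import subprocess

def use_output(device_name):
    # -t output selects the device type, -s picks the device by its exact name
    subprocess.run(["SwitchAudioSource", "-t", "output", "-s", device_name], check=True)

use_output("Logicool G430 Gaming Headset")  # BotA's adapter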
Ravel uses ElasticSearch and the Microphone microservice uses MySQL. Check the /SQL folder for scripts to create the MySQL database and tables. For ElasticSearch, check the /doc folder for scripts on how to configure the ElasticSearch index.
Ravel also uses Redis to store keys. Check /RavelCatavento/redis-cli for more info and a script (script_atualiza_redis.sh) that inserts the necessary keys
A MySQL database is used by the LearningServer for storing elements from all bot corpora stored in Watson Assistant, and the paths to the trained models for each bot. For retrieving bot credentials, the LearningServer reads them from the same Redis database used by Ravel.
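A hypothetical sketch of such a credential lookup (the key name is illustrative; script_atualiza_redis.sh defines the real ones):

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
apikey = r.get("bot-a:apikey")  # placeholder key name
print(apikey)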
5 voices were recorded for the initial prototype. The recording was made using the online demos and the Audacity + Windows WASAPI recording method. Check the /Voices folder. The following online demos for the voices were used:
Isabela_Watson: https://text-to-speech-demo.ng.bluemix.net/
NVDA eSpeak: https://eeejay.github.io/espeak/emscripten/espeak.html
Felipe Nuance: https://www.nuance.com/omni-channel-customer-engagement/voice-and-ivr/text-to-speech.html#! NOTE: The pt-br "Felipe" Nuance voice is also available for MacOS users at no cost. Just install it using this information: https://www.tekrevue.com/tip/make-your-mac-talk-say-command/
Gabriel Cereproc: https://www.cereproc.com/support/live_demo
Maria Microsoft: https://developer.microsoft.com/en-us/microsoft-edge/testdrive/demos/speechsynthesis/
Daniel Microsoft: check the Narrator app installed in the OS
Google Translate Female Voice: install SOX + libmad (Linux: $sudo apt-get install sox libsox-fmt-mp3, Windows: https://www.videohelp.com/software/SoX). Use the GoogleSpeech python lib (https://github.com/desbma/GoogleSpeech). Check /Voices/tts_Google for an example
To generate the audio files using the text-to-speech services via script:
Isabela Watson: check the tts_watson.js file in /Voices/tts_watson
To use the eSpeak voice, first you need to install the eSpeak engine:
$brew install espeak
Then use the espeak binary with the appropriate parameters, e.g.:
$espeak "Esta é uma voz em português" -v brazil -w teste.wav
To use the macOS Felipe voice, use the say binary:
$say "Este é um teste de voz" -v Felipe -o out
Note: the output file format is aiff, which needs to be converted to mp3 using the lame binary
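For example, assuming the say command above produced out.aiff:

import subprocess

# lame reads the AIFF produced by `say` directly and writes an MP3
subprocess.run(["/usr/local/bin/lame", "out.aiff", "out.mp3"], check=True)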
Other Brazilian Portuguese voice alternatives
To use the Microsoft Daniel voice:
a) Enable it by changing the Registry (dan.reg)
b) Test if it is installed:
PowerShell -Command "Add-Type -AssemblyName System.Speech; $speak = New-Object System.Speech.Synthesis.SpeechSynthesizer;$speak.GetInstalledVoices().VoiceInfo; $speak.Dispose(); "
c) Use PowerShell to generate a wav file:
PowerShell -Command "Add-Type -AssemblyName System.Speech; $speak = New-Object System.Speech.Synthesis.SpeechSynthesizer;$speak.SetOutputToWaveFile('test.wav'); $speak.SelectVoice('Microsoft Daniel'); $speak.Speak('Teste de voz'); $speak.Dispose() "
Watson Speech to Text Web Demo: https://speech-to-text-demo.ng.bluemix.net/
Google Web API Speech to Text Web Demo (requires Chrome): https://www.google.com/intl/en/chrome/demos/speech.html
The IBM Cloud STT is used in the Microphone micro-service when the user reads the question, to check whether what is being said is close to what is written on the screen. The STT service used is called 'STT_Catavento' and it is hosted in the IBM Cloud under the Standard plan of the RIS5-BRL account. To access this service use the following URL:
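As a minimal sketch of such a transcription call with the Watson Python SDK (the URL, API key and file name are placeholders, not the real STT_Catavento credentials):

from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

stt = SpeechToTextV1(authenticator=IAMAuthenticator("YOUR_APIKEY"))
stt.set_service_url("https://api.us-south.speech-to-text.watson.cloud.ibm.com")

# Transcribe a recorded question with the Brazilian Portuguese model
with open("question.wav", "rb") as audio:
    result = stt.recognize(audio=audio, content_type="audio/wav",
                           model="pt-BR_BroadbandModel").get_result()
print(result["results"][0]["alternatives"][0]["transcript"])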
The order in which the robots answer (first, second, third) depends on the topic of the answer for a particular question. The topic is stored in the description of the intent in the following format: <topic>|<subtopic>
The following image shows the rules for the order of the robot answers for each topic
If it is a spontaneous question (no training):
Use the table to check the order of the answers
Use the following percentages for the confidences (a code sketch implementing these rules appears after the scenarios below):
First to respond: 80% TO 90% (BASE)
Second to respond: 40% TO 70% (BASE)
Third to respond: 10% TO 30% (BASE)
If the question is asked after training, use the table to check the order of the answers
Scenario 1: Only ONE question has been trained.
If it is the FIRST answer:
First to respond: 91% TO 99% (INCREASED)
Second to respond: 40% TO 70% (EQUAL TO BASE)
Third to respond: 10% TO 30% (EQUAL TO BASE)
If it is the SECOND answer:
First to respond: 80% TO 90% (EQUAL TO BASE)
Second to respond: 71% TO 79% (INCREASED)
Third to respond: 10% TO 30% (EQUAL TO BASE)
If it is the THIRD answer:
First to respond: 80% TO 90% (EQUAL TO BASE)
Second to respond: 40% TO 70% (EQUAL TO BASE)
Third to respond: 10% TO 30% (INCREASED)
Scenario 2: TWO questions have been trained.
If FIRST and SECOND:
First to respond: 91% TO 99% (INCREASED)
Second to respond: 71% TO 79% (INCREASED)
Third to respond: 10% TO 30% (EQUAL TO BASE)
If FIRST and THIRD:
First to respond: 91% TO 99% (INCREASED)
Second to respond: 40% TO 70% (EQUAL TO BASE)
Third to respond: 31% TO 39% (INCREASED)
If SECOND and THIRD:
First to respond: 80% TO 90% (EQUAL TO BASE)
Second to respond: 71% TO 79% (INCREASED)
Third to respond: 31% TO 39% (INCREASED)
Scenario 3: THREE questions have been trained.
First to respond: 91% TO 99% (INCREASED)
Second to respond: 71% TO 79% (INCREASED)
Third to respond: 31% TO 39% (INCREASED)
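A compact sketch of the confidence rules above (the function name and the random draw are illustrative; the real implementation lives in the services):

import random

BASE = {"first": (80, 90), "second": (40, 70), "third": (10, 30)}
INCREASED = {"first": (91, 99), "second": (71, 79), "third": (31, 39)}

def displayed_confidence(order, trained):
    # order: "first" | "second" | "third"
    # trained: set of positions whose answers have been trained
    lo, hi = (INCREASED if order in trained else BASE)[order]
    return random.randint(lo, hi)

# Scenario 2, FIRST and THIRD trained: the second bot keeps its base range
print(displayed_confidence("second", {"first", "third"}))  # between 40 and 70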
We used an Android (7.0+) app called "StartAdroidAPP" to start up and shut down the devices.
Check the diagram "Catavento_ArquiteturaStart_Shutdown.docx" in the /doc/AutomaticStart_Shutdown folder for the physical ethernet cable connection of the devices.
We use Wake-on-LAN on the computers (Windows + MacOS + Raspberry Pi) and an Arduino relay module to start/shut down the devices.
A special entry point in the ScreenMirroring server is responsible for making the equipment sleep (hibernate). The WOL magic packet wakes them up.
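A minimal Wake-on-LAN sender (the MAC address is a placeholder): a magic packet is 6 bytes of 0xFF followed by the target MAC repeated 16 times, broadcast over UDP:

import socket

def wake(mac):
    payload = bytes.fromhex("FF" * 6 + mac.replace(":", "") * 16)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(payload, ("255.255.255.255", 9))

wake("AA:BB:CC:DD:EE:FF")  # placeholder MAC of the target machine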
We used an iPad app called "Kiosk Mode for iPad". Check the following link:
https://itunes.apple.com/us/app/kiosk-mode-for-ipad/id986554705?mt=8
To prevent the user from exiting the browser we used the native iOS kiosk mode called "Guided Access". Check the following link for more information:
https://www.howtogeek.com/177366/how-to-lock-down-your-ipad-or-iphone-for-kids/
To get access to the login page, which is a restricted area in the guided interaction mode, the user should touch right on the welcome title on the index page.
The official Skills workspace for the bots in the Catavento is in the RIS5-BRL account. The name of the service is 'Conversations-Catavento'; it was created in the Standard plan within the 'default' resource group of IBM Cloud and the link to access it is:
NOTE: In the new Watson Assistant Plug (WAP) version (2021) it is necessary to create a linked assistant in order to use it. For the 3 Brazilian bots we created the linked assistants and generated an API key to access them through the API via the ibm-cloud Node.js SDK.
It contains the following bots:
Skill Name: Bot-A-SMART
linked assistant: catavento-bot-a
Skill Name: Bot-B-FUNNY
linked assistant: catavento-bot-b
Skill Name: Bot-C-PATRONIZING
linked assistant: catavento-bot-c
Skill Name: Bot-A-Purple_CSCWDEMO
linked assistant: catavento-bota-CSCWDEMO
Skill Name: BotB-Yellow_CSCWDEMO
linked assistant: catavento-botb-CSCWDEMO
Skill Name: BotC-Green_CSCWDEMO
linked assistant: catavento-botc-CSCWDEMO
Skill Name: Bot_Catavento_vida
Skill Name: Bot_Catavento_vida_versao_multi
The IBM Catavento project is licensed under the Apache 2.0 license. Full license text is available at LICENSE.