Catavento

This is the main repository for the Catavento software. The following folders contain the projects used:

  • /ControllerCatavento -> The NodeJS server that stores the UI, talks to Ravel and sends the data to Processing for rendering
  • /Docs -> General documentation of the project
  • /KioskApp -> Desktop Kiosk Application
  • /MicrophonePython -> The Python app that handles the microphone and the user audio recorded files
  • /LearningServer -> The Bots models implemented with a NLU Python dependency
  • /Processing_Catavento -> The main Processing UI visual application that runs on 4 monitors
  • /RavelCatavento -> Ravel Hub, Governance, Connectors, Wrappers, TOC, State services
  • /ScreenMirroring -> Client and Server Mirroring microservice (for the tutor session)
  • /RobotControl -> The Python app that handles the robot control (Eyes, Mouth, Head). Runs on the Raspberry Pi 3
  • /StartAndroidAPP -> The Android App (Java) that will start and shutdown the devices
  • /SQL -> Scripts to handle the data stored in the MySQL database
  • /Test -> Resources for testing the application (UI, etc)
  • /Voices -> Initial audio files with dialog for the voices (Watson_Isabela, NVDA's eSpeak, Microsoft_Maria, Cereproc_Gabriel, Nuance_Felipe)
  • /VoiceServer -> Python microservice to serve the 'Daniel' Windows voice
  • /workspacesBots -> The JSON files of the bots

Design, experiments and other files

Files for the field research (design fiction, datathon, voice recording) are stored in Box in the subfolders of:

https://ibm.ent.box.com/folder/50333931736

Additional files for the UI design are stored in Box in the following folder:

https://ibm.ent.box.com/folder/52753086573

Dependencies

The project uses the following external dependencies:

  • The NodeJS server

  • Python 2.7 (for the TOC service) and Python 3.5 (for the Microphone, ScreenMirroring, Learning Server services and Desktop App)

  • MySQL and Redis

  • For Processing, the libraries Minim, TimedEvents, 'HTTP Requests for Processing' and GifAnimation. Install these libraries in Processing using the Sketch->Import Library...->Add Library menu of the Processing IDE. The GifAnimation library requires manually copying its files into the libraries folder (Windows: C:\Users...\Documents\Processing\libraries, MacOS: /Users/ibm/Documents/Processing)

  • Processing also uses an external lib called 'Processing HTTP Server' available at https://diskordier.net/simpleHTTPServer/. Just create a /code folder in the sketch and copy the two .jar files (freemarker.jar and SimpleHTTPServer.jar). Also create a /data folder and insert a simple index.html file

  • For the generation of the audio files: SOX (http://sox.sourceforge.net/) and Lame (http://lame.sourceforge.net/download.php). Make sure to install the binaries (sox and lame) in /usr/local/bin

  • For the Volume control in MacOS: osascript and SwitchAudioSource (https://github.com/deweller/switchaudio-osx)

  • For the Volume control in Windows: SetVol (https://rlatour.com/setvol/)

  • For the Raspberry Pi 3/4 we use the Raspbian distribution, NodeJS and Python. We also use the BerryLan (http://www.berrylan.org/) app + service to set up the network connection without a mouse/keyboard/monitor

  • To turn the TVs on and off we use a Linux package called cec-utils (sudo apt install cec-utils) on the Raspberry Pis

  • The Controller app has a package.json file with the dependencies. To install them, run npm install in the Controller folder

  • The Microphone, ScreenMirroring and LearningServer services and the Kiosk App each have a requirements.txt file with the Python dependencies. To install, use "python3 -m pip install -r requirements.txt" in the corresponding folder (for example, the LearningServer folder).

  • The system uses 3 external APIs of the IBM Cloud: Watson Assistant (WA), Speech-to-Text (STT) and Text-to-Speech (TTS). Check the corresponding sections in this README for more details.

  • The script that starts the Controller: iniciaController_Catavento.sh

  • The script that starts all the Ravel Services+Microphone: \RavelCatavento\iniciaRavelCatavento.sh

  • The script that stops all the Ravel Services+Microphone: \RavelCatavento\ParaRavelCatavento.sh

Services and TCP Ports

Ravel Hub: 4000

Wrapper bot-a: 9060

Wrapper bot-b: 9061

Wrapper bot-c: 9062

TOC (Topic Classifier): 5000

Ravel States: 7000

ControllerCatavento: 80

Processing: 8000

Microphone Python: 6060

ScreenMirroring: 10000

LearningServer: 9090

MySQL: 3306

Redis: 6379

RobotControl: 7070

VoiceServer: 5555

Kiosk App: 1000
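
As a quick sanity check of the list above, the following sketch probes each port with a plain TCP connect. The port numbers come from the list; the host (127.0.0.1), the timeout and the function names are assumptions for illustration, since in a real installation the services may run on different machines.

```python
import socket

# Port numbers copied from the "Services and TCP Ports" list.
SERVICE_PORTS = {
    "Ravel Hub": 4000,
    "Wrapper bot-a": 9060,
    "Wrapper bot-b": 9061,
    "Wrapper bot-c": 9062,
    "TOC (Topic Classifier)": 5000,
    "Ravel States": 7000,
    "ControllerCatavento": 80,
    "Processing": 8000,
    "Microphone Python": 6060,
    "ScreenMirroring": 10000,
    "LearningServer": 9090,
    "MySQL": 3306,
    "Redis": 6379,
    "RobotControl": 7070,
    "VoiceServer": 5555,
    "Kiosk App": 1000,
}


def is_port_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="127.0.0.1"):
    """Map each service name to True (port open) or False (closed)."""
    return {name: is_port_open(host, port) for name, port in SERVICE_PORTS.items()}


if __name__ == "__main__":
    # The host is an assumption; pass the address of the machine under test.
    for name, is_up in check_services().items():
        print(f"{name:24s} {'open' if is_up else 'closed'}")
```

An open port only shows that something is listening there; it does not confirm that the expected service is the one answering.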

The Kiosk Desktop App

  • The Kiosk Desktop App is a Python 3 program that runs a browser in fullscreen. This is made possible by the PyWebView dependency (https://pywebview.flowrl.com/)

  • Change the kiosk_app.py file to set up the IP address of the Controller

  • To exit the Kiosk Desktop App with the keyboard: -> In Ubuntu: just press ALT+F4 -> In Windows: hold ALT+TAB, choose the browser window and click the 'X' in the upper right corner with the mouse

Monitor info for the testing environment

  • Make sure to connect 4 monitors/TVs to the computer using the USB HUB and the USB <-> HDMI adapter
  • Make sure NOT to mirror the displays
  • The 3 vertical monitors should be rotated 90 degrees. The question monitor should be in horizontal setting without any rotation
  • The default resolution of the horizontal monitor is 1680 x 1050. The vertical monitors are 1050 x 1680
  • The correct order of the bots is:

Monitor 3=BotA

Monitor 2=BotB

Monitor 1=BotC

If a bot is not on the right monitor, adjust the "--display" parameter in lines 136-145 of Processingcatavento.pde in Processing

Audio info for the testing environment

An additional USB HUB should be plugged with 3 audio adapters (dongles) + 1 USB Microphone. The audio adapters have the following names:

-> BotA: Logicool G430 Gaming Headset

-> BotB: USB PnP Sound Device

-> BotC: USB PnP Sound Device

The default computer audio output ("Built-in Output") should be connected to an external speaker.

Databases

Ravel uses Elasticsearch and the Microphone microservice uses MySQL. Check the /SQL folder for scripts to create the MySQL database and tables. For Elasticsearch, check the /doc folder for scripts on how to configure the Elasticsearch index.

Ravel also uses Redis to store keys. Check /RavelCatavento/redis-cli for more info and a script (script_atualiza_redis.sh) that inserts the necessary keys

A MySQL database is used by the LearningServer for storing elements from all bot corpora stored in Watson Assistant, and the paths to the trained models for each bot. For retrieving bot credentials, the LearningServer reads them from the same Redis database used by Ravel.

Voices

Five voices were recorded for the initial prototype. The recordings were made using the online demos and the Audacity + Windows WASAPI recording method. Check the /Voices folder. The following online demos for the voices were used:

  1. Isabela_Watson: https://text-to-speech-demo.ng.bluemix.net/

  2. NVDA eSpeak: https://eeejay.github.io/espeak/emscripten/espeak.html

  3. Felipe Nuance: https://www.nuance.com/omni-channel-customer-engagement/voice-and-ivr/text-to-speech.html#! NOTE: The pt-BR "Felipe" Nuance voice is also available to MacOS users at no cost. Just install it using this information: https://www.tekrevue.com/tip/make-your-mac-talk-say-command/

  4. Gabriel Cereproc: https://www.cereproc.com/support/live_demo

  5. Maria Microsoft: https://developer.microsoft.com/en-us/microsoft-edge/testdrive/demos/speechsynthesis/

  6. Daniel Microsoft: check the Narrator app installed in the OS

  7. Google Translate Female Voice: install SOX + libmad (Linux: $sudo apt-get install sox libsox-fmt-mp3, Windows: https://www.videohelp.com/software/SoX). Use the GoogleSpeech Python lib (https://github.com/desbma/GoogleSpeech). Check /Voices/tts_Google for an example

To generate the audio with text-to-speech from a script:

  1. Isabela Watson: check the tts_watson.js file in /Voices/tts_watson

  2. To use the eSpeak voice, first you need to install the eSpeak engine:

$brew install espeak

Then use the espeak binary, passing the parameters, e.g.:

$espeak "Esta é uma voz em português" -v brazil -w teste.wav

  3. To use the MacOS pt-BR "Felipe" voice, first install it. Then use the say command:

$say "Este é um teste de voz" -v Felipe -o out

Note: the output file format is aiff, which needs to be converted to mp3 using the lame binary
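
The two steps above (synthesize with say, then convert with lame) can be chained from a script. This is a minimal sketch, not code from the repository: the helper names and file names are hypothetical, and it assumes that the say and lame binaries are on the PATH and that lame is invoked as "lame <input> <output>" with its default encoding settings.

```python
import subprocess


def say_command(text, voice="Felipe", aiff_path="out.aiff"):
    """Build the MacOS 'say' command that writes the speech to an AIFF file."""
    return ["say", text, "-v", voice, "-o", aiff_path]


def lame_command(aiff_path="out.aiff", mp3_path="out.mp3"):
    """Build the 'lame' command that encodes the AIFF file as MP3."""
    return ["lame", aiff_path, mp3_path]


def synthesize_to_mp3(text, voice="Felipe", basename="out"):
    """Run both steps and return the path of the resulting MP3 file."""
    aiff_path = f"{basename}.aiff"
    mp3_path = f"{basename}.mp3"
    # check=True raises CalledProcessError if either binary fails.
    subprocess.run(say_command(text, voice, aiff_path), check=True)
    subprocess.run(lame_command(aiff_path, mp3_path), check=True)
    return mp3_path
```

On a Mac with the voice installed, synthesize_to_mp3("Este é um teste de voz") would leave out.aiff and out.mp3 in the current directory.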

Other Brazilian Portuguese voice alternatives

  1. To use the Microsoft Daniel voice:

    a) Enable it by changing the Registry (dan.reg)

    b) Test if it is installed:

PowerShell -Command "Add-Type -AssemblyName System.Speech; $speak = New-Object System.Speech.Synthesis.SpeechSynthesizer;$speak.GetInstalledVoices().VoiceInfo; $speak.Dispose(); "

    c) Use PowerShell to generate a wav file:

PowerShell -Command "Add-Type -AssemblyName System.Speech; $speak = New-Object System.Speech.Synthesis.SpeechSynthesizer;$speak.SetOutputToWaveFile('test.wav'); $speak.SelectVoice('Microsoft Daniel'); $speak.Speak('Teste de voz'); $speak.Dispose() "

IBM Text to Speech (TTS)

  • The IBM Cloud TTS is used as one of the Robot voices (Isabela voice for Robot A) when it speaks. It is implemented in the Voice microservice, which caches results to avoid repeated calls. The TTS service used is called 'TTS_Catavento' and it is hosted in the IBM Cloud under the Standard plan for the RIS5-BRL account.
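
The caching idea mentioned above can be sketched as follows. This is not the Voice microservice's actual code: the cache directory, the key scheme (a SHA-256 hash of voice and text) and the synthesize callable are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def cache_key(text, voice):
    """Derive a stable file name from the voice and the text to be spoken."""
    return hashlib.sha256(f"{voice}|{text}".encode("utf-8")).hexdigest()


def get_audio(text, voice, synthesize, cache_dir="tts_cache"):
    """Return audio bytes, calling synthesize(text, voice) only on a cache miss.

    'synthesize' is a hypothetical callable that wraps the remote TTS request
    and returns the audio as bytes.
    """
    directory = Path(cache_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{cache_key(text, voice)}.bin"
    if path.exists():
        # Cache hit: no remote call is made.
        return path.read_bytes()
    audio = synthesize(text, voice)
    path.write_bytes(audio)
    return audio
```

Keying on both voice and text keeps two voices that speak the same sentence from overwriting each other's audio.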

Speech to Text (STT)

  • Watson Speech to Text Web Demo: https://speech-to-text-demo.ng.bluemix.net/

  • Google Web API Speech to Text Web Demo (requires Chrome): https://www.google.com/intl/en/chrome/demos/speech.html

  • The IBM Cloud STT is used in the Microphone microservice when the user reads the question, to check whether what is being said is close to what is written on the screen. The STT service used is called 'STT_Catavento' and it is hosted in the IBM Cloud under the Standard plan for the RIS5-BRL account. To access this service use the following URL:

Robot Answering Order

The order in which the Robots answer (first, second, third) depends on the topic of the answer for a particular question. The topic is stored in the description of the intent in the following format: "<topic>|<subtopic>"

The following image shows the rules for the order of the Robot answers by topic

Answering Order Rules
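
As an illustration of the intent description format above, the sketch below splits a description into its topic and subtopic. It assumes that the angle brackets are placeholders rather than literal characters; the function name and the example values are hypothetical.

```python
def parse_intent_description(description):
    """Split a '<topic>|<subtopic>' description into (topic, subtopic)."""
    topic, separator, subtopic = description.partition("|")
    if not separator:
        raise ValueError(f"expected 'topic|subtopic', got: {description!r}")
    return topic.strip(), subtopic.strip()


if __name__ == "__main__":
    # Hypothetical values; the real topics live in the Watson Assistant intents.
    print(parse_intent_description("science|astronomy"))
```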

Confidence Values

Flowchart for the confidence selecting process

If it is a spontaneous question (no training):

  • Use the table to check the order of the answers

  • Use the following percentages for the confidences:

    First to respond: 80% TO 90% (BASE)

    Second to respond: 40% TO 70% (BASE)

    Third to respond: 10% TO 30% (BASE)

If the question is asked after training, use the table to check the order of the answers

  • Scenario 1: Only ONE question has been trained.

    If it is the FIRST answer:

    First to respond:      91% TO 99% (INCREASED)

    Second to respond:     40% TO 70% (EQUAL TO BASE)

    Third to respond:      10% TO 30% (EQUAL TO BASE)


    If it is the SECOND answer:

    First to respond:      80% TO 90% (EQUAL TO BASE)

    Second to respond:     71% TO 79% (INCREASED)

    Third to respond:      10% TO 30% (EQUAL TO BASE)


    If it is the THIRD answer:

    First to respond:      80% TO 90% (EQUAL TO BASE)

    Second to respond:     40% TO 70% (EQUAL TO BASE)

    Third to respond:      31% TO 39% (INCREASED)
    
  • Scenario 2: TWO questions have been trained.

    If FIRST and SECOND:

    First to respond:      91% TO 99% (INCREASED)
    
    Second to respond:     71% TO 79% (INCREASED)
    
    Third to respond:      10% TO 30% (EQUAL TO BASE)
    

    If FIRST and THIRD:

    First to respond:      91% TO 99% (INCREASED)
    
    Second to respond:     40% TO 70% (EQUAL TO BASE)
    
    Third to respond:      31% TO 39% (INCREASED)
    

    If SECOND and THIRD:

    First to respond:      80% TO 90% (EQUAL TO BASE)
    
    Second to respond:     71% TO 79% (INCREASED)
    
    Third to respond:      31% TO 39% (INCREASED)
    
  • Scenario 3: THREE questions have been trained.

    First to respond: 91% TO 99% (INCREASED)

    Second to respond: 71% TO 79% (INCREASED)

    Third to respond: 31% TO 39% (INCREASED)
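
The three scenarios reduce to one rule: each position uses its increased range if its answer was trained and its base range otherwise. The sketch below encodes that rule. It is an illustration, not the project's implementation: the names are hypothetical, it assumes the increased range for the third position is always 31% to 39% (as listed in Scenarios 2 and 3), and drawing a uniform random integer inside the range is an assumption, since the way a value is picked within a range is not described here.

```python
import random

# Ranges in percent, copied from the lists above.
BASE = {"first": (80, 90), "second": (40, 70), "third": (10, 30)}
INCREASED = {"first": (91, 99), "second": (71, 79), "third": (31, 39)}


def confidence_ranges(trained=()):
    """Return the (low, high) range per position.

    'trained' holds the positions ("first", "second", "third") whose answers
    were trained; an empty value corresponds to a spontaneous question.
    """
    trained = set(trained)
    unknown = trained - set(BASE)
    if unknown:
        raise ValueError(f"unknown positions: {sorted(unknown)}")
    return {
        position: (INCREASED if position in trained else BASE)[position]
        for position in BASE
    }


def sample_confidences(trained=(), rng=random):
    """Pick one integer confidence per position (uniform draw: an assumption)."""
    return {
        position: rng.randint(low, high)
        for position, (low, high) in confidence_ranges(trained).items()
    }
```

For example, confidence_ranges({"first", "third"}) returns the ranges listed under "If FIRST and THIRD" in Scenario 2.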

StartAndroidAPP mobile APP

We used an Android (7.0+) app called "StartAndroidAPP" to start up and shut down the devices.

Check the diagram "Catavento_ArquiteturaStart_Shutdown.docx" in the /doc/AutomaticStart_Shutdown folder for the physical Ethernet cable connections of the devices.

We use Wake-on-LAN (WOL) on the computers (Windows + MacOS + Raspberry Pi) and an Arduino relay module to start and shut down the devices.

A special entry point in the ScreenSharing server is responsible for making the equipment sleep (hibernate). The WOL magic packet wakes it up.
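
For reference, a Wake-on-LAN magic packet is 6 bytes of 0xFF followed by the target MAC address repeated 16 times. The sketch below builds and broadcasts one. It is not the StartAndroidAPP code (which is written in Java): the function names are hypothetical, and the broadcast address and UDP port 9 are common conventions assumed here, not values taken from the project.

```python
import socket


def build_magic_packet(mac):
    """Return the 102-byte magic packet for a MAC such as 'AA:BB:CC:DD:EE:FF'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"invalid MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

The target machine must have Wake-on-LAN enabled in its firmware and network adapter settings for the packet to have any effect.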

iPad APP

We used an iPad APP called "Kiosk Mode for iPad". Check the following link:

https://itunes.apple.com/us/app/kiosk-mode-for-ipad/id986554705?mt=8

To prevent the user from exiting the browser we used the native iOS kiosk mode called "Guided Access". Check the following link for more information:

https://www.howtogeek.com/177366/how-to-lock-down-your-ipad-or-iphone-for-kids/

loginMonitor UI

To access the login page, which is a restricted area for the guided interaction mode, the user should touch the area corresponding to "BEM" in the welcome title of the index page.

Skills (WorkSpace) for the bots

The official Skills workspace for the bots in Catavento is in the RIS5-BRL account. The name of the service is 'Conversations-Catavento'; it was created under the Standard plan within the 'default' resource group of IBM Cloud, and the link to access it is:

NOTE: In the new Watson Assistant Plug (WAP) version (2021) it is necessary to create a linked assistant in order to use it. For the 3 Brazilian bots we created the linked assistants and generated an API key to access them via the ibm-cloud NodeJS SDK.

It contains the following bots:

Bot-A-SMART - PURPLE

Skill Name: Bot-A-SMART

linked assistant: catavento-bot-a

Bot-B-FUNNY - YELLOW

Skill Name: Bot-B-FUNNY

linked assistant: catavento-bot-b

Bot-C-PATRONIZING - GREEN

Skill Name: Bot-C-PATRONIZING

linked assistant: catavento-bot-c

Skills (WorkSpace) for the bots of the CSCW demo

Bot-A-Purple_CSCWDEMO

Skill Name: Bot-A-Purple_CSCWDEMO

linked assistant: catavento-bota-CSCWDEMO

BotB-Yellow_CSCWDEMO

Skill Name: BotB-Yellow_CSCWDEMO

linked assistant: catavento-botb-CSCWDEMO

BotC-Green_CSCWDEMO

Skill Name: BotC-Green_CSCWDEMO

linked assistant: catavento-botc-CSCWDEMO

Bot_Catavento_vida

Skill Name: Bot_Catavento_vida

Bot_Catavento_vida_versao_multi

Skill Name: Bot_Catavento_vida_versao_multi

License

The IBM Catavento project is licensed under the Apache 2.0 license. Full license text is available at LICENSE.

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. 
For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the 
Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. 
Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. 
To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
