Translated the "How to" for VOICEVOX.
"Reader" refers to the character that reads the text, with "character" referring to letters.
Struck-through text is my own input.

How to use

Before we begin

This document is for learning how to use the text-to-speech voice synthesizer software VOICEVOX.
Make sure to read the Terms and Conditions first:
https://voicevox.hiroshiba.jp/term/

We also have a video that teaches you the basics of the program:
https://www.youtube.com/watch?v=4yVpklclxwU

How to run

Windows

When trying to run this program on Windows, you might get a warning dialog saying "Windows protected your PC". If you do, click on "More info" and choose "Run anyway".
[image]

Mac

During the first run of the program, you might get a dialog saying that this program is not registered with Apple:
[image]
If so, in Finder, click on VOICEVOX's icon while holding down the Control key. From the shortcut menu, select "Open", and then click "Open" in the dialog that appears.
You can also select "System Preferences" from the Apple menu and, in the General tab, select "Open Anyway".
If you are on a system that runs on Apple Silicon:
When running this program for the first time, if prompted to install Rosetta, please follow the install wizard and install it:
[image]

Running the voice synthesis engine

The first thing to start up is the voice synthesis engine. If you have an NVIDIA GPU with at least 3 GB of memory, you can use GPU mode, which is faster.
* GPU mode is not available on Mac.
[image]

Voice synthesis

Click on the empty space to the right of the character icon to input text.
From now on, this row is referred to as the "text row", and the input field within it as the "input area".
[image]
Press the Enter key to confirm the text. The readings and accents for the text will then appear at the bottom of the screen.
From now on, this part of the screen is referred to as the "customization area".
[image]
When you click the "play" button, the voice is first generated and then played back.
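
The same two steps (text to readings and accents, then readings and accents to audio) can also be driven without the editor, because the editor talks to the voice synthesis engine over HTTP. The following is a minimal sketch and not part of the original guide. It assumes the engine is running locally on port 50021 and exposes `/audio_query` and `/synthesis` endpoints. The `speaker` id and the helper names `build_url` and `synthesize` are placeholders chosen for illustration, so check the engine's own API documentation before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a locally running engine listening on this address.
ENGINE_URL = "http://127.0.0.1:50021"


def build_url(path, **params):
    """Build an engine URL with URL-encoded query parameters."""
    return f"{ENGINE_URL}{path}?{urllib.parse.urlencode(params)}"


def synthesize(text, speaker):
    """Return WAV bytes for `text`, read by the reader with id `speaker`."""
    # Step 1: text -> readings and accents (what the customization area shows).
    request = urllib.request.Request(
        build_url("/audio_query", text=text, speaker=speaker),
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        query = json.load(response)

    # Step 2: readings and accents -> audio (what the "play" button triggers).
    request = urllib.request.Request(
        build_url("/synthesis", speaker=speaker),
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the engine running, calling `synthesize("こんにちは", 1)` should return WAV data that can be written to a file. Which reader the id 1 maps to depends on the installed engine.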

Adding or removing text

Clicking on the "+" at the bottom right adds an empty text row.
When you hover over a text row, a trash can icon appears; click it to delete that text row. You can also select multiple text rows at once.
[image]

Changing the reader

Click on the reader icon(s) to the left of the text row(s) and select a reader from the dropdown menu:
[image]
You can change the order of the readers through "キャラクター並べ替え" ("Reorder characters").
It is the third option in the Settings menu; drag and drop the character names on the right of the screen.

Changing the order of the text rows

You can change the order of the text rows by clicking near the input area and dragging.

Changing the spacing between characters

When characters are unintentionally connected or separated, you can fix this by clicking on the empty space between the characters in the "Accent" tab.
[image]
For example, click on the gap between the "two words":
[image]
to turn them into a "single word".
Similarly, when you want to add a separation, click on the empty space between the characters:
[image]
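
As a mental model for what these clicks do, the text can be thought of as a list of accent groups, each holding its characters and one accent position. The sketch below is a simplified illustration under that assumption. The dictionary layout, the function names, and the accent position given to the new group after a split are all hypothetical and not taken from VOICEVOX itself.

```python
def join_phrases(phrases, i):
    """Join group i with group i + 1 (clicking the gap between two groups)."""
    left, right = phrases[i], phrases[i + 1]
    # Assumption: the joined group keeps the accent position of the left group.
    merged = {"moras": left["moras"] + right["moras"], "accent": left["accent"]}
    return phrases[:i] + [merged] + phrases[i + 2:]


def split_phrase(phrases, i, k):
    """Split group i before character k (clicking the gap between two characters)."""
    phrase = phrases[i]
    if not 0 < k < len(phrase["moras"]):
        raise ValueError("split point must fall between two characters")
    left = {"moras": phrase["moras"][:k], "accent": min(phrase["accent"], k)}
    # Assumption: the new right-hand group starts with accent position 1.
    right = {"moras": phrase["moras"][k:], "accent": 1}
    return phrases[:i] + [left, right] + phrases[i + 1:]


phrases = [
    {"moras": ["ディ", "ー", "プ"], "accent": 1},
    {"moras": ["ラ", "ー", "ニ", "ン", "グ"], "accent": 1},
]
joined = join_phrases(phrases, 0)      # one group of eight characters
separated = split_phrase(joined, 0, 3)  # back to two groups
```

Both functions return a new list and leave the input untouched, which makes it easy to compare the state before and after a click.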

Changing the accent

If the desired accent is not achieved, you can change it in two ways. The recommended way is to use the accent slider.
For example, if you want "Deep learning" to be read as "↑deepuraa↓ningu", drag the slider to right above the「ラ」character:
[image]
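
To show what moving the accent position changes, here is a rough sketch of the textbook high/low pitch pattern of standard (Tokyo) Japanese: the pitch drops right after the accented character, and the first character is low unless it carries the accent itself. This is a simplification for illustration only. VOICEVOX produces its actual pitch values itself, and the function name and the 1-based `accent` argument are assumptions.

```python
def pitch_pattern(moras, accent):
    """Return (character, "H" or "L") pairs for a 1-based accent position."""
    if not 1 <= accent <= len(moras):
        raise ValueError("accent must point at one of the characters")
    pattern = []
    for position, mora in enumerate(moras, start=1):
        if accent == 1:
            # Accent on the first character: high first, low afterwards.
            high = position == 1
        else:
            # Otherwise: low first, high up to the accent, low afterwards.
            high = 1 < position <= accent
        pattern.append((mora, "H" if high else "L"))
    return pattern


moras = ["ディ", "ー", "プ", "ラ", "ー", "ニ", "ン", "グ"]
for mora, level in pitch_pattern(moras, accent=4):
    print(mora, level)
```

Changing `accent` from 4 to another value moves the point where the pattern falls from high to low, which is the effect the slider has in this simplified picture.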

Changing the intonation

If the desired result is not achieved even after adjusting the accents, or if you want to make finer adjustments, you can change the intonation of each individual character from the "Intonation" tab:
[image]
You can also enlarge the customization area to make more detailed changes to the intonation:
[image]
You can also move the slider with the mouse wheel. Hold "Ctrl" while using the mouse wheel to make a smaller change with every scroll.
Furthermore, when characters like「キ」,「ツ」, and「ス」are muted, their sliders are grayed out in the Intonation tab. You can unmute a character by clicking on it:
[image]
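
A rough sketch of the mute toggle, for readers who think in code. It assumes each character is stored with its vowel as a Latin letter, and that a muted (devoiced) vowel is written in upper case while a voiced one is lower case. That convention, the dictionary layout, and the function names are assumptions for illustration, not a description of VOICEVOX internals.

```python
def can_toggle(mora):
    """Only characters whose vowel is i or u can be muted or unmuted."""
    return mora["vowel"].lower() in ("i", "u")


def toggle_mute(mora):
    """Return a copy of the character with its vowel muted or unmuted."""
    if not can_toggle(mora):
        raise ValueError("only characters ending in i or u can be toggled")
    vowel = mora["vowel"]
    # Assumption: upper case marks a muted vowel, lower case a voiced one.
    flipped = vowel.lower() if vowel.isupper() else vowel.upper()
    return {**mora, "vowel": flipped}


ki_muted = {"text": "キ", "consonant": "k", "vowel": "I"}
ka = {"text": "カ", "consonant": "k", "vowel": "a"}
ki_voiced = toggle_mute(ki_muted)
```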
You can only mute/unmute characters that end with either an「イ」("i", pronounced "ee") or an「ウ」("u", pronounced "oo") sound.