thorstenMueller/deep-learning-german-tts 50
The free german voice dataset.
thorstenMueller/jetson-containers 1
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
thorstenMueller/dockerfile2markdown 0
Generate md doc from dockerfile
thorstenMueller/i-spy-ml-edition 0
Machine learning for kids game "i spy with my little eye"
thorstenMueller/mimic-recording-studio 0
Mimic Recording Studio is a Docker-based application you can install to record voice samples, which can then be trained into a TTS voice with Mimic2
Text to Speech engine based on the Tacotron architecture, initially implemented by Keith Ito.
thorstenMueller/symcon-module-mycroft 0
IP-Symcon module to interact with MyCroft instances
thorstenMueller/symcon-python 0
Python module for ip-symcon api
thorstenMueller/symcon-skill 0
mycroft skill for ip-symcon integration
Generates a documentation of an IP-Symcon installation
issue comment mozilla/TTS
I don't think the error is in TTS but in TF.
comment created time in 13 hours
issue opened mozilla/TTS
Hi! There is already a similar issue, but it's not exactly the same as mine.
First, I trained the model from scratch, and that worked fine.
But when I try to continue training from a checkpoint with
python TTS/bin/distribute.py --script train_vocoder_wavegrad.py --continue_path ../checkpoints/wavegrad-private-January-19-2021_10+16AM-d481fa2/
I get a RuntimeError:
Using CUDA: True
Number of GPUs: 4
Training continues for ../checkpoints/wavegrad-private-January-19-2021_10+16AM-d481fa2/best_model.pth.tar
Mixed precision is enabled
Loading wavs from: /data2/datasets/wavs_all
Setting up Audio Processor...
| > sample_rate:22050
| > resample:True
| > num_mels:80
| > min_level_db:-100
| > frame_shift_ms:None
| > frame_length_ms:None
| > ref_level_db:20
| > fft_size:1024
| > power:None
| > preemphasis:0.98
| > griffin_lim_iters:None
| > signal_norm:True
| > symmetric_norm:True
| > mel_fmin:0
| > mel_fmax:8000.0
| > spec_gain:20.0
| > stft_pad_mode:reflect
| > max_norm:4.0
| > clip_norm:True
| > do_trim_silence:True
| > trim_db:30
| > do_sound_norm:False
| > stats_path:None
| > hop_length:256
| > win_length:1024
Generator Model: wavegrad
Restoring Model...
Restoring Optimizer...
Restoring LR Scheduler...
Restoring AMP Scaler...
Model restored from step 20864
WaveGrad has 15827106 parameters
EPOCH: 0/10000
TRAINING (2021-01-20 09:16:06)
Traceback (most recent call last):
File "/data2/wavegrad_mozilla/TTS/TTS/bin/train_vocoder_wavegrad.py", line 501, in <module>
main(args)
File "/data2/wavegrad_mozilla/TTS/TTS/bin/train_vocoder_wavegrad.py", line 400, in main
epoch)
File "/data2/wavegrad_mozilla/TTS/TTS/bin/train_vocoder_wavegrad.py", line 137, in train
scaler.step(optimizer)
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 294, in step
retval = optimizer.step(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py", line 67, in wrapper
return wrapped(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/optim/adam.py", line 99, in step
exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cpu!
Does anybody know how to solve this? Thanks a lot!
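One possible reading of this traceback, offered as an assumption rather than a confirmed diagnosis: exp_avg belongs to the restored Adam state, so that state may still sit on the CPU while the gradients are on cuda:2. A minimal sketch of moving such state to the training device follows. The helper name move_optimizer_state is hypothetical and not part of TTS or PyTorch; it only assumes that the state is a mapping of per-parameter dicts (as torch.optim.Optimizer.state is) and that tensor values expose a .to(device) method.

```python
def move_optimizer_state(state, device):
    """Move every tensor-like value in a per-parameter state mapping to `device`.

    `state` is expected to look like torch.optim.Optimizer.state: a mapping
    from parameter to a dict such as {"step": ..., "exp_avg": ..., "exp_avg_sq": ...}.
    Hypothetical helper for illustration, not part of TTS or PyTorch.
    """
    for param_state in state.values():
        for key, value in param_state.items():
            # Tensors expose .to(device); plain ints/floats (e.g. "step") do not.
            if hasattr(value, "to"):
                param_state[key] = value.to(device)
    return state
```

With a real optimizer this would be called as move_optimizer_state(optimizer.state, device), using the device of the model parameters, after the checkpoint is restored and before the first scaler.step(optimizer).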
created time in 16 hours
started pythonspeed/filprofiler
started time in 16 hours
issue comment mozilla/TTS
Hi. I am using the master branch and installed it by running setup.py install (in an Anaconda environment named TF2).
This is the trace now:
(TF2) C:\Users\luisv\Documents\MozillaTTS-master2\TTS>python TTS/bin/train_tacotron.py --config_path TTS/tts/configs/config.json
2021-01-19 16:36:00.668750: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-01-19 16:36:00.668938: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Using CUDA: False
Number of GPUs: 0
Traceback (most recent call last):
File "TTS/bin/train_tacotron.py", line 689, in <module>
c = load_config(args.config_path)
File "C:\Users\luisv\anaconda3\envs\TF2\lib\site-packages\tts-0.0.8+907cb98-py3.8-win-amd64.egg\TTS\utils\io.py", line 38, in load_config
input_str = f.read()
File "C:\Users\luisv\anaconda3\envs\TF2\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 3006: character maps to <undefined>
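The last frame of this trace is cp1252.py, which suggests the config file was decoded with the Windows default code page rather than UTF-8. A minimal sketch of reading a text file with an explicit encoding follows; read_text_utf8 is a hypothetical helper for illustration and is not the actual load_config from TTS/utils/io.py.

```python
def read_text_utf8(path):
    """Read a text file as UTF-8 regardless of the platform default encoding.

    Hypothetical helper for illustration. On Windows, open() without an
    explicit encoding typically falls back to a code page such as cp1252,
    in which byte 0x8f has no mapping.
    """
    with open(path, "r", encoding="utf-8") as f:
        return f.read()
```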
comment created time in a day
started Ideefixze/deepspeech-hot-words-booster
started time in a day
started thorstenMueller/deep-learning-german-tts
started time in 2 days
started wkentaro/gdown
started time in 2 days
started thorstenMueller/deep-learning-german-tts
started time in 3 days
issue comment mozilla/TTS
One simple trick to run WaveGrad even faster.
- Generate a rough waveform with Griffin-Lim
- Give it to WaveGrad and use it instead of random noise as input.
It is able to generate way higher quality in 6 iterations compared to the normal run. And you can even generate good quality in 3 iterations.
@hello, I ran the experiment the way you suggested, but I find that the Griffin-Lim waveform is shorter than the noise, and the missing length is hop_length. How do you deal with the Griffin-Lim waveform?
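On the length question: with centered STFT framing, inverting T frames yields (T - 1) * hop_length samples, whereas a noise input sized T * hop_length is one hop longer, which would match the gap described here. A minimal sketch, assuming that sizing, pads (or trims) the Griffin-Lim waveform to the expected length. match_length is a hypothetical helper, and plain Python lists stand in for audio arrays.

```python
def match_length(samples, num_frames, hop_length, pad_value=0.0):
    """Pad with `pad_value` or trim so that len(result) == num_frames * hop_length.

    Hypothetical helper; assumes the vocoder expects num_frames * hop_length
    input samples for a spectrogram with num_frames frames.
    """
    target = num_frames * hop_length
    samples = list(samples)
    if len(samples) < target:
        # Append trailing silence to cover the missing samples.
        samples.extend([pad_value] * (target - len(samples)))
    return samples[:target]
```

Padding with silence at the end is only one option; whether reflecting or repeating the last samples sounds better is not something this sketch settles.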
comment created time in 3 days
push event mozilla/TTS
commit sha c96f7a2614ae336ac8b4c1657af444846719cd83
TorchSTFT to device fix
commit sha b2b4828f17f745856d6abb642f5cb8991da0edfe
set requires_grad=False
commit sha b70bef579a3953210dde51f73be7c95d97f914ab
Merge pull request #620 from gerazov/dev TorchSTFT to device fix
push time in 3 days
PR merged mozilla/TTS
This should close #619
pr closed time in 3 days
push event mozilla/TTS
commit sha 23edc0533ff32c5699e7e5d3be36d4c385e0f2c6
Fix minor typo in README.md
commit sha 77b61455f87c26c27450b99344c365aa34d19f64
Merge pull request #621 from KathyReid/patch-1 Fix minor typo in README.md
push time in 3 days
PR merged mozilla/TTS
pr closed time in 3 days
PR opened mozilla/TTS
pr created time in 3 days
push event rhasspy/hassio-addons
commit sha c9f06393c15a5a9983051ce7f96e913ba96c2681
Bump to 2.5.9
push time in 3 days
issue comment mozilla/TTS
@erogol The config in your shared folder seems to differ from the one at 72a6ac5 (the commit ID corresponding to Tacotron2 DDC in the wiki); it differs in the fields present in the JSON, not in the values. If I want to do the same sort of thing for a different language, can you please confirm which one I should use as my starting point for training the Tacotron2 part?
Thanks
comment created time in 4 days
PR opened mozilla/TTS
This should close #619
pr created time in 5 days
issue opened mozilla/TTS
torch.stft fails with "Expected all tensors to be on the same device"
When running vocoder training it fails with (whole Traceback at the end):
File ".../lib/python3.6/site-packages/torch/functional.py", line 516, in stft
normalized, onesided, return_complex)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
I've located the problem in the TorchSTFT class in TTS/vocoder/losses.py, line 7:
class TorchSTFT():
    def __init__(self, n_fft, hop_length, win_length, window='hann_window'):
        """ Torch based STFT operation """
        self.n_fft = n_fft
        self.hop_length = hop_length
        self.win_length = win_length
        self.window = getattr(torch, window)(win_length)
The problem is with self.window, which doesn't get transferred to CUDA when the loss function gets transferred via criterion_gen.cuda() in TTS/bin/train_vocoder_gan.py, line 536.
I've managed to solve this by subclassing torch.nn.Module and listing self.window as a parameter. This way .cuda() will transfer the window to CUDA and stft will work:
class TorchSTFT(nn.Module):
    def __init__(self, n_fft, hop_length, win_length, window='hann_window'):
        """ Torch based STFT operation """
        super(TorchSTFT, self).__init__()
        self.n_fft = n_fft
        self.hop_length = hop_length
        self.win_length = win_length
        self.window = nn.Parameter(getattr(torch, window)(win_length))
Here's the PR:
In it I've also added the parameter return_complex=False because of the change in the default behaviour of torch.stft.
I'm wondering if torch just didn't report this in the previous version and silently went ahead transferring to and from the CPU/GPU. If so, could this fix speed up vocoder model training?
This is my environment:
# packages in environment at /home/vibe/miniconda3/envs/tts:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
absl-py 0.11.0 pypi_0 pypi
astroid 2.4.2 pypi_0 pypi
astunparse 1.6.3 pypi_0 pypi
attrdict 2.0.1 pypi_0 pypi
attrs 20.3.0 pypi_0 pypi
audioread 2.1.9 pypi_0 pypi
blas 1.0 mkl
bokeh 1.4.0 pypi_0 pypi
ca-certificates 2020.12.8 h06a4308_0
cachetools 4.2.0 pypi_0 pypi
cardboardlint 1.3.0 pypi_0 pypi
certifi 2020.12.5 py36h06a4308_0
cffi 1.14.4 pypi_0 pypi
chardet 4.0.0 pypi_0 pypi
click 7.1.2 pypi_0 pypi
clldutils 3.6.0 pypi_0 pypi
colorlog 4.6.2 pypi_0 pypi
csvw 1.9.0 pypi_0 pypi
cycler 0.10.0 pypi_0 pypi
cython 0.29.21 py36h2531618_0
dataclasses 0.8 pypi_0 pypi
decorator 4.4.2 pypi_0 pypi
filelock 3.0.12 pypi_0 pypi
flask 1.1.2 pypi_0 pypi
gast 0.3.3 pypi_0 pypi
google-auth 1.24.0 pypi_0 pypi
google-auth-oauthlib 0.4.2 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.34.0 pypi_0 pypi
h5py 2.10.0 pypi_0 pypi
idna 2.10 pypi_0 pypi
importlib-metadata 3.3.0 pypi_0 pypi
inflect 5.0.2 pypi_0 pypi
intel-openmp 2020.2 254
isodate 0.6.0 pypi_0 pypi
isort 4.3.21 pypi_0 pypi
itsdangerous 1.1.0 pypi_0 pypi
jinja2 2.11.2 pypi_0 pypi
joblib 1.0.0 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
kiwisolver 1.3.1 pypi_0 pypi
lazy-object-proxy 1.4.3 pypi_0 pypi
ld_impl_linux-64 2.33.1 h53a641e_7
libedit 3.1.20191231 h14c3975_1
libffi 3.3 he6710b0_2
libgcc-ng 9.1.0 hdf63c60_0
librosa 0.7.2 pypi_0 pypi
libstdcxx-ng 9.1.0 hdf63c60_0
llvmlite 0.31.0 pypi_0 pypi
markdown 3.3.3 pypi_0 pypi
markupsafe 1.1.1 pypi_0 pypi
matplotlib 3.3.3 pypi_0 pypi
mccabe 0.6.1 pypi_0 pypi
mkl 2020.2 256
mkl-service 2.3.0 py36he8ac12f_0
mkl_fft 1.2.0 py36h23d657b_0
mkl_random 1.1.1 py36h0573a6f_0
ncurses 6.2 he6710b0_1
nose 1.3.7 pypi_0 pypi
numba 0.48.0 pypi_0 pypi
numpy 1.18.5 pypi_0 pypi
oauthlib 3.1.0 pypi_0 pypi
openssl 1.1.1i h27cfd23_0
opt-einsum 3.3.0 pypi_0 pypi
packaging 20.8 pypi_0 pypi
phonemizer 2.2.2 pypi_0 pypi
pillow 8.1.0 pypi_0 pypi
pip 20.3.3 py36h06a4308_0
protobuf 3.14.0 pypi_0 pypi
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pycparser 2.20 pypi_0 pypi
pylint 2.5.3 pypi_0 pypi
pyparsing 2.4.7 pypi_0 pypi
pysbd 0.3.3 pypi_0 pypi
pysocks 1.7.1 pypi_0 pypi
python 3.6.12 hcff3b4d_2
python-dateutil 2.8.1 pypi_0 pypi
pyworld 0.2.12 pypi_0 pypi
pyyaml 5.3.1 pypi_0 pypi
readline 8.0 h7b6447c_0
regex 2020.11.13 pypi_0 pypi
requests 2.25.1 pypi_0 pypi
requests-oauthlib 1.3.0 pypi_0 pypi
resampy 0.2.2 pypi_0 pypi
rfc3986 1.4.0 pypi_0 pypi
rsa 4.6 pypi_0 pypi
scikit-learn 0.24.0 pypi_0 pypi
scipy 1.5.4 pypi_0 pypi
segments 2.2.0 pypi_0 pypi
setuptools 51.0.0 py36h06a4308_2
six 1.15.0 py36h06a4308_0
soundfile 0.10.3.post1 pypi_0 pypi
sqlite 3.33.0 h62c20be_0
tabulate 0.8.7 pypi_0 pypi
tensorboard 2.4.0 pypi_0 pypi
tensorboard-plugin-wit 1.7.0 pypi_0 pypi
tensorboardx 2.1 pypi_0 pypi
tensorflow 2.3.1 pypi_0 pypi
tensorflow-estimator 2.3.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
threadpoolctl 2.1.0 pypi_0 pypi
tk 8.6.10 hbc83047_0
toml 0.10.2 pypi_0 pypi
torch 1.7.1 pypi_0 pypi
tornado 6.1 pypi_0 pypi
tqdm 4.55.1 pypi_0 pypi
tts 0.0.6+9cf474a pypi_0 pypi
typed-ast 1.4.2 pypi_0 pypi
typing-extensions 3.7.4.3 pypi_0 pypi
umap-learn 0.4.6 pypi_0 pypi
unidecode 0.04.20 pypi_0 pypi
uritemplate 3.0.1 pypi_0 pypi
urllib3 1.26.2 pypi_0 pypi
werkzeug 1.0.1 pypi_0 pypi
wheel 0.36.2 pyhd3eb1b0_0
wrapt 1.12.1 pypi_0 pypi
xz 5.2.5 h7b6447c_0
zipp 3.4.0 pypi_0 pypi
zlib 1.2.11 h7b6447c_3
And here's the whole Traceback:
$ python TTS/bin/train_vocoder_gan.py --config_path TTS/vocoder/configs/my_parallel_wavegan_config.json
> Using CUDA: True
> Number of GPUs: 1
> Git Hash: 7beaacc
> Experiment folder: /home/vibe/tts/mozilla/Models/LJSpeech/pwgan-January-16-2021_11+48AM-7beaacc
> Loading wavs from: /home/vibe/tts/databases/LJSpeech-1.1/wavs/
> Setting up Audio Processor...
| > sample_rate:22050
| > resample:False
| > num_mels:80
| > min_level_db:-100
| > frame_shift_ms:None
| > frame_length_ms:None
| > ref_level_db:0
| > fft_size:1024
| > power:None
| > preemphasis:0.0
| > griffin_lim_iters:None
| > signal_norm:True
| > symmetric_norm:True
| > mel_fmin:50.0
| > mel_fmax:7600.0
| > spec_gain:1.0
| > stft_pad_mode:reflect
| > max_norm:4.0
| > clip_norm:True
| > do_trim_silence:True
| > trim_db:60
| > do_sound_norm:False
| > stats_path:/home/vibe/tts/databases/LJSpeech-1.1/scale_stats.npy
| > hop_length:256
| > win_length:1024
> Generator Model: parallel_wavegan_generator
> Discriminator Model: parallel_wavegan_discriminator
> Generator has 1320442 parameters
> Discriminator has 99842 parameters
> EPOCH: 0/10000
> TRAINING (2021-01-16 11:48:49)
/home/vibe/miniconda3/envs/tts/lib/python3.6/site-packages/torch/functional.py:516: UserWarning: stft will require the return_complex parameter be explicitly specified in a future PyTorch release. Use return_complex=False to preserve the current behavior or return_complex=True to return a complex output. (Triggered internally at /pytorch/aten/src/ATen/native/SpectralOps.cpp:653.)
normalized, onesided, return_complex)
! Run is removed from /home/vibe/tts/mozilla/Models/LJSpeech/pwgan-January-16-2021_11+48AM-7beaacc
Traceback (most recent call last):
File "TTS/bin/train_vocoder_gan.py", line 654, in <module>
main(args)
File "TTS/bin/train_vocoder_gan.py", line 559, in main
epoch)
File "TTS/bin/train_vocoder_gan.py", line 152, in train
feats_real, y_hat_sub, y_G_sub)
File "/home/vibe/miniconda3/envs/tts/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/vibe/tts/mozilla/TTS_gerazov/TTS/vocoder/layers/losses.py", line 233, in forward
stft_loss_mg, stft_loss_sc = self.stft_loss(y_hat.squeeze(1), y.squeeze(1))
File "/home/vibe/miniconda3/envs/tts/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/vibe/tts/mozilla/TTS_gerazov/TTS/vocoder/layers/losses.py", line 70, in forward
lm, lsc = f(y_hat, y)
File "/home/vibe/miniconda3/envs/tts/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/vibe/tts/mozilla/TTS_gerazov/TTS/vocoder/layers/losses.py", line 46, in forward
y_hat_M = self.stft(y_hat)
File "/home/vibe/tts/mozilla/TTS_gerazov/TTS/vocoder/layers/losses.py", line 25, in __call__
onesided=True)
File "/home/vibe/miniconda3/envs/tts/lib/python3.6/site-packages/torch/functional.py", line 516, in stft
normalized, onesided, return_complex)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
created time in 5 days
started Snugface/alx
started time in 5 days
issue opened mozilla/TTS
Training error when "bidirectional_decoder" is set to true
When I set the parameter "bidirectional_decoder" to true, training fails with an error:
Traceback (most recent call last):
File "/home/user/projects/ASR/TTS-Mozilla/TTS/bin/train_tts.py", line 705, in <module>
main(args)
File "/home/user/projects/ASR/TTS-Mozilla/TTS/bin/train_tts.py", line 617, in main
global_step, epoch, amp, speaker_mapping)
File "/home/user/projects/ASR/TTS-Mozilla/TTS/bin/train_tts.py", line 162, in train
text_input, text_lengths, mel_input, mel_lengths, speaker_ids=speaker_ids, speaker_embeddings=speaker_embeddings)
File "/home/user/anaconda3/envs/tts-mozilla/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/user/projects/ASR/TTS-Mozilla/TTS/tts/models/tacotron2.py", line 133, in forward
decoder_outputs_backward, alignments_backward = self._backward_pass(mel_specs, encoder_outputs, input_mask)
File "/home/user/projects/ASR/TTS-Mozilla/TTS/tts/models/tacotron_abstract.py", line 143, in _backward_pass
self.speaker_embeddings_projected)
File "/home/user/anaconda3/envs/tts-mozilla/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
TypeError: forward() takes 4 positional arguments but 5 were given
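The final TypeError counts self as one of the positional arguments: a forward that accepts three inputs reports "takes 4 positional arguments", and passing a fourth input yields "5 were given". In the trace, the extra input appears to be self.speaker_embeddings_projected passed from _backward_pass. A minimal reproduction of the same message follows; the class and argument names are hypothetical and not taken from the TTS code.

```python
class Decoder:
    """Hypothetical stand-in for a decoder whose forward takes three inputs."""

    def forward(self, inputs, memories, mask):
        return inputs, memories, mask


def call_with_extra_argument():
    """Call forward with a fourth input and return the resulting error message."""
    decoder = Decoder()
    try:
        # Four inputs plus the implicit self make five positional arguments.
        decoder.forward("inputs", "memories", "mask", "speaker_embeddings")
    except TypeError as err:
        return str(err)
    return None
```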
created time in 5 days
started files-community/Files
started time in 5 days
fork MarcGrotheer/deep-learning-german-tts
The free german voice dataset.
https://thorstenmueller.github.io/deep-learning-german-tts/
fork in 5 days
started jitsi/jiwer
started time in 5 days
issue opened mozilla/TTS
Hi, I am trying to train my model from scratch, but Python printed the following message (I haven't changed the original code of the project):
python TTS/bin/train_tacotron.py --config_path TTS/tts/configs/config.json
Traceback (most recent call last):
File "TTS/bin/train_tacotron.py", line 18, in <module>
from TTS.tts.utils.generic_utils import check_config_tts, setup_model
ImportError: cannot import name 'check_config_tts' from 'TTS.tts.utils.generic_utils' (C:\Users\luisv\anaconda3\envs\TF2\lib\site-packages\tts-0.0.4-py3.8.egg\TTS\tts\utils\generic_utils.py)
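The path in the ImportError points at an installed egg (tts-0.0.4-py3.8.egg) rather than the checked-out source tree, so an older installed copy may be shadowing the current code. This is an assumption drawn from the trace, not a confirmed cause. A small sketch for checking where Python resolves a top-level package from follows; module_origin is a hypothetical helper.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin
```

Calling module_origin("TTS") in the affected environment would show whether the egg in site-packages or the repository checkout is picked up.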
created time in 6 days
started alexal1/Insomniac
started time in 6 days
started veliovgroup/Meteor-Files
started time in 6 days