Web Speech API (try saying "cat" or "dog"; pretty much anything else returns 'image not found')
Start Speech Recognition
Speak Text
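
The two buttons above could be wired up roughly as follows. This is only a sketch, not the page's actual code: the lookup table, the image file names, and the function names `imageForTranscript`, `startRecognition`, and `speakText` are assumptions made for illustration. The only behavior taken from the page is that "cat" and "dog" resolve to an image and anything else gives 'image not found'.

```typescript
// Hypothetical lookup: the page only names "cat" and "dog"; the file names are assumptions.
const KNOWN_IMAGES = new Map<string, string>([
  ["cat", "cat.jpg"],
  ["dog", "dog.jpg"],
]);

// Map a recognized transcript to an image, or to the fallback message from the page note.
function imageForTranscript(transcript: string): string {
  const word = transcript.trim().toLowerCase();
  return KNOWN_IMAGES.get(word) ?? "image not found";
}

// "Start Speech Recognition": listen once and hand the image (or fallback) to a callback.
function startRecognition(onImage: (image: string) => void): void {
  const g = globalThis as any;
  // Some browsers expose the constructor only under the webkit prefix.
  const Recognition = g.SpeechRecognition ?? g.webkitSpeechRecognition;
  if (!Recognition) return; // speech recognition is not available here
  const recognition = new Recognition();
  recognition.lang = "en-US";
  recognition.onresult = (event: any) => {
    // First alternative of the first result holds the transcript.
    onImage(imageForTranscript(event.results[0][0].transcript));
  };
  recognition.start();
}

// "Speak Text": read the given text aloud with speech synthesis.
function speakText(text: string): void {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return; // not available here
  g.speechSynthesis.speak(new g.SpeechSynthesisUtterance(text));
}
```

Keeping the word-to-image lookup in its own function means it can be checked without a microphone, and the two browser-dependent functions simply do nothing where the Web Speech API is missing.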