Frequently asked questions
Answers to the most common questions about the Aiva platform, technologies and integrations.
1. Core stack: which platform powers the bot (in-house, Google Dialogflow, Amazon Lex, Yandex Alisa, etc.)?
An in-house orchestrator. The modular architecture lets us plug in different ASR, TTS and LLM providers and integrate with external systems.
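To illustrate what a modular orchestrator of this kind can look like, here is a minimal Python sketch. The interface names (`ASR`, `LLM`, `TTS`), the `Orchestrator` class and the stub providers are hypothetical and are not the Aiva API. They only show how providers behind a common interface can be swapped without changing the dialog loop.

```python
from dataclasses import dataclass
from typing import Protocol


class ASR(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LLM(Protocol):
    def reply(self, prompt: str, user_text: str) -> str: ...


class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class Orchestrator:
    """Runs one dialog turn: speech in, speech out (hypothetical design)."""
    asr: ASR
    llm: LLM
    tts: TTS
    prompt: str

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.asr.transcribe(audio)           # speech-to-text
        answer = self.llm.reply(self.prompt, text)  # generate the reply text
        return self.tts.synthesize(answer)          # text-to-speech


# Stub providers standing in for real engines (assumptions for the demo).
class EchoASR:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class CannedLLM:
    def reply(self, prompt: str, user_text: str) -> str:
        return f"You said: {user_text}"


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


bot = Orchestrator(asr=EchoASR(), llm=CannedLLM(), tts=BytesTTS(), prompt="Be brief.")
```

Replacing `EchoASR` with a class that wraps a real recognition engine would leave `handle_turn` unchanged, which is the point of the modular design.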
2. Whose technology powers speech-to-text? Does it support different accents, dialects and spontaneous speech with pauses and corrections?
We support multiple recognition engines: Google, Groq, OpenAI, Yandex and our own inference. The system handles accents, pauses, spontaneous speech and self-corrections reliably.
3. Whose technology powers text-to-speech? What voices are available (male, female, neutral)? Can timbre, speed and intonation be tuned?
We use provider voices (including Google), with more than 10 voice options available. Timbre, speed, pauses and intonation can be tuned via the prompt, and the voice can be customized to your brand.
4. How does the bot understand the dialog context? Can it sustain a multi-turn conversation with clarifications?
Yes. We use a scenario prompt, conversation memory and branching logic. We recommend building prompts based on real call recordings to cover all business scenarios.
5. How does it handle typos, slang and complex phrasing?
Typos and other errors, colloquial constructions and slang are handled correctly. A vocabulary model can be tuned to a specific industry or business.
6. Are there built-in scenarios (intents) for an industry (e.g. doctor's appointments, table reservations, support)?
Each scenario gets its own prompt with integrations. For example, the agent can query a database for available slots and create a request or lead.
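The slot lookup and lead creation mentioned above could be exposed to the agent as two small functions. This is only a sketch under stated assumptions: the `slots` and `leads` tables, their columns and the function names are hypothetical, and an in-memory SQLite database stands in for the customer's real system.

```python
import sqlite3


def setup_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical schema and sample slots."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE slots (id INTEGER PRIMARY KEY, starts_at TEXT, booked INTEGER DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE leads (id INTEGER PRIMARY KEY, name TEXT, phone TEXT, slot_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO slots (starts_at, booked) VALUES (?, ?)",
        [("2030-01-15 10:00", 0), ("2030-01-15 11:00", 1), ("2030-01-15 12:00", 0)],
    )
    conn.commit()
    return conn


def find_free_slots(conn: sqlite3.Connection) -> list[tuple[int, str]]:
    """Return (id, starts_at) for every slot that is not booked yet."""
    return conn.execute(
        "SELECT id, starts_at FROM slots WHERE booked = 0 ORDER BY starts_at"
    ).fetchall()


def create_lead(conn: sqlite3.Connection, name: str, phone: str, slot_id: int) -> int:
    """Book the slot and store the lead in one transaction; return the lead id."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE slots SET booked = 1 WHERE id = ? AND booked = 0", (slot_id,)
        )
        if cur.rowcount != 1:
            raise ValueError("slot is no longer available")
        cur = conn.execute(
            "INSERT INTO leads (name, phone, slot_id) VALUES (?, ?, ?)",
            (name, phone, slot_id),
        )
        return cur.lastrowid
```

Checking `booked = 0` inside the `UPDATE` keeps two concurrent calls from booking the same slot.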
7. Which systems does it integrate with (CRM: AmoCRM, Bitrix24; telephony; 1C; knowledge bases; messengers)?
We support integrations with CRM systems, telephony, 1C, knowledge bases and messengers. Custom connectors can be developed on request.
8. Where and how is conversation data stored? Do you ensure compliance with 152-FZ (Russia) or GDPR?
Recordings and transcripts are stored on Russian servers.
9. Latency: what is the average response time from the end of the customer's utterance to the bot's reply (in milliseconds)?
Average latency is 600–700 ms (with a stable connection).
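For reference, the metric described in the question can be computed from call logs as follows. This is a minimal sketch: the log format (pairs of millisecond timestamps for the end of the customer's utterance and the start of the bot's reply) and the sample values are assumptions made for illustration, not measured data.

```python
from statistics import mean


def response_latencies_ms(turns: list[tuple[int, int]]) -> list[int]:
    """Latency per turn: reply start minus utterance end, both in milliseconds."""
    return [reply_start - utterance_end for utterance_end, reply_start in turns]


def average_latency_ms(turns: list[tuple[int, int]]) -> float:
    """Average response latency over all turns of a call."""
    return mean(response_latencies_ms(turns))


# Hypothetical timestamps: (utterance_end_ms, reply_start_ms)
sample_turns = [(10_000, 10_620), (15_300, 15_980), (21_100, 21_750)]
avg = average_latency_ms(sample_turns)  # per-turn latencies are 620, 680 and 650 ms
```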
10. Throughput: how many simultaneous conversations can the system handle? How does it scale during traffic peaks (e.g. an advertising campaign)?
The system runs on Kubernetes and scales on demand for traffic peaks (campaigns, mass outreach, seasonal spikes).
11. Voice sources: are stock voices used (Google, Amazon, Yandex) or custom-built?
By default we use provider voices, and custom voices can be plugged in when needed.
12. Quality and naturalness: can we hear live samples (demo recordings)? How emotional and natural is the speech (neural TTS)?
Neural TTS models are used. Demo recordings can be provided. Speech quality keeps improving as model providers ship updates.
13. Can the voice be tuned to our brand (gender, age, character)?
Yes — we can pick the timbre, emotion, style and speech character.
14. Can a synthesized voice be created for a key employee or brand persona?
Yes, this is possible.
15. Can speech speed, key-word emphasis and pauses be adjusted?
Yes — speed, pauses, logical stress and emphasis are configurable.
16. Audio production: do you record welcome messages, jingles and background music?
Yes, we can provide all of these.
17. Pricing model: how is the service priced (monthly/yearly subscription, per-minute, per successful dialog)?
Per-minute billing at 7–10 cents per minute, plus a one-time integration fee.
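As a rough budgeting aid, the per-minute range can be turned into a usage estimate. This is a sketch only: the function name is hypothetical, billing by whole minutes is an assumption, and the one-time integration fee is left out because its amount is not stated here.

```python
# Rate range taken from the answer above, in cents per minute.
RATE_CENTS_PER_MINUTE = (7, 10)


def estimate_usage_cost_cents(minutes: int) -> tuple[int, int]:
    """Return the (low, high) usage cost in cents for a number of billed minutes."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    low_rate, high_rate = RATE_CENTS_PER_MINUTE
    return minutes * low_rate, minutes * high_rate


low, high = estimate_usage_cost_cents(5000)  # 35000 and 50000 cents
```

For 5,000 minutes of calls per month this gives a usage cost between 35,000 and 50,000 cents, before the integration fee.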
18. Implementation timeline: how long does setup and launch take for a typical / non-typical project?
Typical project: prompt setup takes 1–3 days, telephony setup 1–3 days. For non-typical projects the timeline depends on the integration scope.
19. Who configures the bot: us via the builder or your team?
There is a web interface for self-service setup. You can use your own team or engage ours for consulting and ongoing support.
20. Bot training: how does initial training happen? How easy is it to add new questions and answers after launch?
Initial training is based on real dialog recordings and business cases. Prompt edits take effect in production immediately.
21. Analytics and reports: what data does the system provide (recordings, transcripts, dialog map, metrics: resolution rate, hand-off reason, sentiment)?
Available: recordings, transcripts, intent detection, outcome classification, basic metrics. Extended analytics is in development.
22. Key advantage: what is your main differentiator from other market solutions (technology or outcome, not price)?
A flexible orchestrator architecture that lets us plug in best-in-class models and rapidly adapt to specific customer business processes.
Still have questions?
Contact us and we will answer all your questions about the Aiva platform.
