FireRedASR Pro Java Integration Guide: SpringBoot Microservice Speech Processing in Practice
If you're a Java backend developer who has just been asked to add speech recognition to a system — say, transcribing customer-service call recordings or analyzing meeting audio — you're probably looking for a solution that is both stable and easy to integrate. FireRedASR Pro, a high-performance speech recognition service, fits the bill. But fitting it elegantly into a SpringBoot microservice so that it handles short clips with fast responses, processes long recordings without timing out, and plays nicely with your existing message queues and databases takes some know-how.

In this article I'll walk you through integrating FireRedASR Pro into a SpringBoot project, step by step, as someone who has been through it. No abstract theory — we go straight to code: starting with a simple HTTP call, then designing asynchronous tasks for long audio files, then stream processing with a message queue, and finally packaging everything as a Docker image. Follow along and you'll end up with a production-ready speech processing microservice module.

## 1. Project setup and base dependencies

First, let's scaffold the project. Assuming you already have SpringBoot experience, create a standard SpringBoot web project. Open your IDE (or use Spring Initializr) and pick these dependencies:

- **Spring Web** — provides the RESTful API endpoints.
- **Spring Boot DevTools** — optional but recommended during development.
- **Lombok** — cuts boilerplate; `@Data`, `@Slf4j` and friends do the heavy lifting.

We also need a few extra libraries for HTTP calls, JSON parsing, and (optionally) gRPC. In `pom.xml`, alongside the SpringBoot starters, add:

```xml
<dependencies>
    <!-- Spring Boot starters -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <!-- HTTP client: we use OkHttp; RestTemplate or WebClient work too -->
    <dependency>
        <groupId>com.squareup.okhttp3</groupId>
        <artifactId>okhttp</artifactId>
        <version>4.12.0</version>
    </dependency>

    <!-- JSON handling -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
    </dependency>

    <!-- Lombok -->
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>

    <!-- Testing -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>
```

The project layout looks roughly like this:

```
src/main/java/com/yourcompany/asr/
├── Application.java
├── config/
├── controller/
├── service/
│   ├── impl/
│   └── AsrService.java
├── client/
│   └── FireRedAsrClient.java
├── task/
├── entity/
└── dto/
```

With the skeleton in place, configure the FireRedASR Pro service basics in `application.yml`:

```yaml
# FireRedASR Pro service configuration
fire-red:
  asr:
    # HTTP API endpoint (change to your actual deployment address)
    http-endpoint: http://your-firered-asr-server:8000/v1/recognize
    # gRPC address (if using gRPC)
    grpc-endpoint: your-firered-asr-server:50051
    # API key or auth token (if required)
    api-key: your-api-key-here
    # Default recognition language
    default-language: zh-CN
    # Connect timeout (ms)
    connect-timeout: 10000
    # Read timeout (ms)
    read-timeout: 30000
```
## 2. Building the speech recognition HTTP client

FireRedASR Pro typically exposes an HTTP API, which is the most common integration path. Let's build a client that encapsulates all the details of talking to the recognition service. First, two simple DTOs (data transfer objects) mapping the request and response.

```java
// src/main/java/com/yourcompany/asr/dto/AsrRequest.java
package com.yourcompany.asr.dto;

import lombok.Data;

@Data
public class AsrRequest {
    /** Base64-encoded audio data */
    private String audioData;
    /** Audio format, e.g. wav, mp3, pcm */
    private String audioFormat = "wav";
    /** Recognition language, e.g. zh-CN, en-US */
    private String language;
    /** Whether to add punctuation */
    private Boolean enablePunctuation = true;
    // Other possible parameters: speaker diarization, hotwords, etc.
}
```

```java
// src/main/java/com/yourcompany/asr/dto/AsrResponse.java
package com.yourcompany.asr.dto;

import lombok.Data;
import java.util.List;

@Data
public class AsrResponse {
    /** Status code; 0 means success */
    private Integer code;
    /** Status message */
    private String message;
    /** Recognized text */
    private String text;
    /** Confidence (0-1) */
    private Double confidence;
    /** Per-speaker segments, if diarization is enabled */
    private List<Segment> segments;
    /** Processing time (ms) */
    private Long processTime;

    @Data
    public static class Segment {
        private String speaker;
        private String text;
        private Long startTime;
        private Long endTime;
    }
}
```

Next, the core client class. We use OkHttp for its simplicity and efficiency.

```java
// src/main/java/com/yourcompany/asr/client/FireRedAsrClient.java
package com.yourcompany.asr.client;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.yourcompany.asr.dto.AsrRequest;
import com.yourcompany.asr.dto.AsrResponse;
import lombok.extern.slf4j.Slf4j;
import okhttp3.*;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

import javax.annotation.PostConstruct;
import java.io.IOException;
import java.util.concurrent.TimeUnit;

@Slf4j
@Component
public class FireRedAsrClient {

    @Value("${fire-red.asr.http-endpoint}")
    private String httpEndpoint;

    @Value("${fire-red.asr.api-key:}")
    private String apiKey;

    @Value("${fire-red.asr.connect-timeout:10000}")
    private int connectTimeout;

    @Value("${fire-red.asr.read-timeout:30000}")
    private int readTimeout;

    private OkHttpClient httpClient;
    private final ObjectMapper objectMapper = new ObjectMapper();

    public static final MediaType JSON = MediaType.get("application/json; charset=utf-8");

    @PostConstruct
    public void init() {
        // Initialize the HTTP client with the configured timeouts
        this.httpClient = new OkHttpClient.Builder()
                .connectTimeout(connectTimeout, TimeUnit.MILLISECONDS)
                .readTimeout(readTimeout, TimeUnit.MILLISECONDS)
                .build();
        log.info("FireRedASR HTTP client initialized, endpoint: {}", httpEndpoint);
    }

    /**
     * Synchronous recognition for short audio (up to ~60 seconds).
     */
    public AsrResponse recognizeShortAudio(AsrRequest request) throws IOException {
        // Build the request body
        String requestJson = objectMapper.writeValueAsString(request);
        RequestBody body = RequestBody.create(requestJson, JSON);

        // Build the request
        Request.Builder requestBuilder = new Request.Builder()
                .url(httpEndpoint)
                .post(body);

        // Add the auth header if configured
        if (apiKey != null && !apiKey.trim().isEmpty()) {
            requestBuilder.addHeader("Authorization", "Bearer " + apiKey);
        }

        Request httpRequest = requestBuilder.build();

        // Send the request and handle the response
        try (Response response = httpClient.newCall(httpRequest).execute()) {
            if (!response.isSuccessful()) {
                log.error("ASR request failed, status: {}, message: {}",
                        response.code(), response.message());
                throw new IOException("ASR request failed: " + response.code());
            }
            String responseBody = response.body().string();
            return objectMapper.readValue(responseBody, AsrResponse.class);
        } catch (IOException e) {
            log.error("IO error while calling the ASR service", e);
            throw e;
        }
    }

    /**
     * Convenience method: pass Base64 audio and a language, get back the text.
     */
    public String recognizeSimple(String audioBase64, String language) {
        try {
            AsrRequest request = new AsrRequest();
            request.setAudioData(audioBase64);
            request.setLanguage(language);
            request.setAudioFormat("wav"); // assume wav format
            AsrResponse response = recognizeShortAudio(request);
            if (response != null && response.getCode() == 0) {
                return response.getText();
            } else {
                log.warn("Recognition failed, response: {}", response);
                return null;
            }
        } catch (Exception e) {
            log.error("Speech recognition error", e);
            return null;
        }
    }
}
```

This client does a few things: it reads configuration, initializes the HTTP connection pool, encapsulates request/response (de)serialization, and offers a simple synchronous call. For short audio — voice messages under a minute, say — this is direct and efficient.
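The `audioData` field carries Base64-encoded audio bytes. A minimal, dependency-free sketch of preparing that payload from a file (the stand-in byte array and class name here are illustrative, not part of the FireRedASR API):

```java
import java.util.Base64;

public class AudioPayloadDemo {
    /** Encode raw audio bytes for the audioData field. */
    static String toBase64(byte[] audioBytes) {
        return Base64.getEncoder().encodeToString(audioBytes);
    }

    public static void main(String[] args) {
        // In real use: byte[] audio = java.nio.file.Files.readAllBytes(Path.of("clip.wav"));
        byte[] audio = new byte[]{0x52, 0x49, 0x46, 0x46, 0x00, 0x01, 0x02, 0x03}; // stand-in bytes
        String b64 = toBase64(audio);
        System.out.println(b64);          // → UklGRgABAgM=
        // Base64 inflates the payload by ~33%; watch your server's request size limits
        System.out.println(b64.length()); // 8 bytes -> 12 Base64 chars
    }
}
```

Keep in mind that for multi-megabyte files this inflation, plus JSON overhead, argues for the asynchronous path described in the next section rather than a synchronous request.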
## 3. Asynchronous tasks for long audio

In real scenarios you'll more often face a half-hour meeting recording or an hour-long interview. Making an HTTP request wait synchronously that long is unrealistic: it will time out and tie up server threads. So we need an asynchronous task mechanism. Spring Boot's `@Async` annotation makes asynchronous method calls easy. First, configure an async task executor.

```java
// src/main/java/com/yourcompany/asr/config/AsyncConfig.java
package com.yourcompany.asr.config;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

import java.util.concurrent.Executor;
import java.util.concurrent.ThreadPoolExecutor;

@Configuration
@EnableAsync
public class AsyncConfig {

    @Bean(name = "asrTaskExecutor")
    public Executor asrTaskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        // Core threads kept alive even when idle
        executor.setCorePoolSize(5);
        // Maximum pool size
        executor.setMaxPoolSize(20);
        // Queue capacity
        executor.setQueueCapacity(100);
        // Thread name prefix
        executor.setThreadNamePrefix("AsrAsync-");
        // When the queue is full, the calling thread runs the task itself
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        executor.initialize();
        return executor;
    }
}
```

Next we need somewhere to store and manage the state and results of these asynchronous tasks. For simplicity we'll use an in-memory `ConcurrentHashMap`; for production you'll want a database (e.g. MySQL) or a distributed cache (e.g. Redis). We show the in-memory version here and sketch where the database would slot in. First, the task status enum and entity.

```java
// src/main/java/com/yourcompany/asr/entity/AsrTask.java
package com.yourcompany.asr.entity;

import lombok.Data;
import java.time.LocalDateTime;

@Data
public class AsrTask {
    /** Task ID */
    private String taskId;
    /** Task status */
    private TaskStatus status;
    /** Audio source identifier, e.g. an OSS URL or file path */
    private String audioSource;
    /** Recognized text */
    private String resultText;
    /** Error message */
    private String errorMsg;
    /** Created at */
    private LocalDateTime createTime;
    /** Finished at */
    private LocalDateTime finishTime;

    public enum TaskStatus {
        PENDING,     // waiting
        PROCESSING,  // in progress
        SUCCESS,     // done
        FAILED       // failed
    }
}
```

Then the service layer, which submits tasks, reports their status, and contains the actual asynchronous recognition logic.

```java
// src/main/java/com/yourcompany/asr/service/AsrService.java
package com.yourcompany.asr.service;

import com.yourcompany.asr.entity.AsrTask;

public interface AsrService {
    /** Submit a long-audio recognition task (asynchronous). */
    String submitLongAudioTask(String audioUrl, String language);

    /** Query task status and result. */
    AsrTask getTaskStatus(String taskId);
}
```

```java
// src/main/java/com/yourcompany/asr/service/impl/AsrServiceImpl.java
package com.yourcompany.asr.service.impl;

import com.yourcompany.asr.client.FireRedAsrClient;
import com.yourcompany.asr.dto.AsrRequest;
import com.yourcompany.asr.dto.AsrResponse;
import com.yourcompany.asr.entity.AsrTask;
import com.yourcompany.asr.service.AsrService;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

import javax.annotation.PostConstruct;
import java.time.LocalDateTime;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

@Slf4j
@Service
public class AsrServiceImpl implements AsrService {

    @Autowired
    private FireRedAsrClient asrClient;

    // In-memory task store; replace with a database in production
    private final Map<String, AsrTask> taskStore = new ConcurrentHashMap<>();

    @PostConstruct
    public void init() {
        log.info("ASR service initialized; long audio handled via async tasks.");
    }

    @Override
    public String submitLongAudioTask(String audioUrl, String language) {
        // Generate a unique task ID
        String taskId = "ASR_" + UUID.randomUUID().toString().replace("-", "").substring(0, 16);

        AsrTask task = new AsrTask();
        task.setTaskId(taskId);
        task.setAudioSource(audioUrl);
        task.setStatus(AsrTask.TaskStatus.PENDING);
        task.setCreateTime(LocalDateTime.now());

        // Store the task
        taskStore.put(taskId, task);
        log.info("Submitted long-audio task, ID: {}, audio: {}", taskId, audioUrl);

        // Kick off async processing
        processAudioAsync(taskId, audioUrl, language);
        return taskId;
    }

    /**
     * Core asynchronous processing method, run on the custom executor.
     */
    @Async("asrTaskExecutor")
    public void processAudioAsync(String taskId, String audioUrl, String language) {
        AsrTask task = taskStore.get(taskId);
        if (task == null) {
            log.error("Task not found: {}", taskId);
            return;
        }
        task.setStatus(AsrTask.TaskStatus.PROCESSING);
        log.info("Processing long-audio task: {}", taskId);

        try {
            // 1. Download the audio from audioUrl (implement the download yourself)
            // byte[] audioBytes = downloadAudio(audioUrl);

            // 2. Base64-encode the audio; large files may need chunking or a different API
            // String audioBase64 = Base64.getEncoder().encodeToString(audioBytes);

            // 3. Call the ASR service
            // AsrRequest request = new AsrRequest();
            // request.setAudioData(audioBase64);
            // request.setLanguage(language);
            // AsrResponse response = asrClient.recognizeShortAudio(request);

            // Simulate a long-running job
            Thread.sleep(10000); // pretend this takes 10 seconds
            AsrResponse mockResponse = new AsrResponse();
            mockResponse.setCode(0);
            mockResponse.setText("Mock transcription of the long audio. Source: " + audioUrl);

            // 4. Record the result
            task.setResultText(mockResponse.getText());
            task.setStatus(AsrTask.TaskStatus.SUCCESS);
            task.setFinishTime(LocalDateTime.now());
            log.info("Long-audio task succeeded: {}", taskId);
        } catch (Exception e) {
            log.error("Long-audio task failed: {}", taskId, e);
            task.setStatus(AsrTask.TaskStatus.FAILED);
            task.setErrorMsg(e.getMessage());
            task.setFinishTime(LocalDateTime.now());
        }
    }

    @Override
    public AsrTask getTaskStatus(String taskId) {
        return taskStore.get(taskId);
    }

    // To implement for production:
    // private byte[] downloadAudio(String url) { ... }
}
```

One caveat: calling `processAudioAsync` from `submitLongAudioTask` within the same class is a self-invocation that bypasses Spring's proxy, so `@Async` is silently ignored. In a real project, move the async method into a separate bean (or inject the proxied self) so the annotation actually takes effect.
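The `CallerRunsPolicy` chosen for the executor in this section deserves a moment: when the queue fills up, the submitting thread runs the task itself, which throttles intake instead of dropping work. Stripped of Spring, the behavior can be observed with a plain `java.util.concurrent` pool (the miniature 1-worker/1-slot sizing below is purely for demonstration):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CallerRunsDemo {
    public static void main(String[] args) throws Exception {
        CountDownLatch hold = new CountDownLatch(1);
        List<String> ranOn = Collections.synchronizedList(new ArrayList<>());

        // 1 worker, queue of 1 — a miniature version of the 5/20/100 pool above
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());

        pool.execute(() -> {                                             // occupies the single worker
            try { hold.await(); } catch (InterruptedException ignored) {}
        });
        pool.execute(() -> ranOn.add(Thread.currentThread().getName())); // sits in the queue
        pool.execute(() -> ranOn.add(Thread.currentThread().getName())); // rejected -> runs on caller

        // The third task has already run synchronously on the submitting thread
        System.out.println(ranOn.get(0)); // prints "main" — back-pressure lands on the caller
        hold.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The upshot for the ASR service: under load, HTTP request threads submitting recognition jobs will slow down rather than silently lose tasks — usually the right trade-off for this workload.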
Finally, a simple REST controller so external callers can submit tasks and fetch results.

```java
// src/main/java/com/yourcompany/asr/controller/AsrController.java
package com.yourcompany.asr.controller;

import com.yourcompany.asr.client.FireRedAsrClient;
import com.yourcompany.asr.dto.AsrRequest;
import com.yourcompany.asr.dto.AsrResponse;
import com.yourcompany.asr.entity.AsrTask;
import com.yourcompany.asr.service.AsrService;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.*;

import java.io.IOException;

@Slf4j
@RestController
@RequestMapping("/api/asr")
public class AsrController {

    @Autowired
    private FireRedAsrClient asrClient;

    @Autowired
    private AsrService asrService;

    /** Synchronous recognition — for short audio. */
    @PostMapping("/recognize")
    public AsrResponse recognize(@RequestBody AsrRequest request) {
        log.info("Sync recognition request, format: {}, language: {}",
                request.getAudioFormat(), request.getLanguage());
        try {
            return asrClient.recognizeShortAudio(request);
        } catch (IOException e) {
            AsrResponse errorResp = new AsrResponse();
            errorResp.setCode(500);
            errorResp.setMessage("ASR call failed: " + e.getMessage());
            return errorResp;
        }
    }

    /** Submit an asynchronous long-audio recognition task. */
    @PostMapping("/task/submit")
    public String submitTask(@RequestParam String audioUrl,
                             @RequestParam(defaultValue = "zh-CN") String language) {
        log.info("Async task submitted, audio URL: {}", audioUrl);
        return asrService.submitLongAudioTask(audioUrl, language);
    }

    /** Query async task status. */
    @GetMapping("/task/status/{taskId}")
    public AsrTask getTaskStatus(@PathVariable String taskId) {
        return asrService.getTaskStatus(taskId);
    }
}
```

With that, the microservice skeleton supports both synchronous short-audio recognition and asynchronous long-audio tasks. After uploading a long file, the caller immediately receives a task ID and polls it to retrieve the result.

## 4. Stream processing with a message queue

Once recognition volume grows — say, real-time audio streams from several customer-service channels — plain HTTP requests and an in-memory queue stop being enough. Introducing a message queue such as Kafka to decouple and buffer the pipeline is a good move. The architecture becomes: audio streams are pushed to Kafka, our recognition service consumes tasks from it, and results are written back to a separate result topic.

First, add the Kafka dependency to `pom.xml`:

```xml
<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
</dependency>
```

Then configure Kafka in `application.yml`:

```yaml
spring:
  kafka:
    bootstrap-servers: localhost:9092
    consumer:
      group-id: asr-service-group
      auto-offset-reset: earliest
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer

# Custom topics
asr:
  kafka:
    topic:
      audio-input: asr.audio.input
      result-output: asr.result.output
```
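In this design, messages on the input topic are assumed to be JSON objects carrying `audioData` (Base64 audio) and `language` fields — that is this article's convention, not something mandated by FireRedASR. A dependency-free sketch of building such a payload (plain string formatting is safe here only because Base64 output never needs JSON escaping; real producers should use Jackson):

```java
import java.util.Base64;

public class AudioMessageDemo {
    /** Build the JSON payload the consumer expects: audioData + language. */
    static String buildPayload(byte[] audio, String language) {
        String b64 = Base64.getEncoder().encodeToString(audio);
        // Base64 uses only A-Z, a-z, 0-9, +, /, = — all JSON-safe, so no escaping is needed
        return String.format("{\"audioData\":\"%s\",\"language\":\"%s\"}", b64, language);
    }

    public static void main(String[] args) {
        String payload = buildPayload(new byte[]{1, 2, 3}, "zh-CN");
        System.out.println(payload); // {"audioData":"AQID","language":"zh-CN"}
    }
}
```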
Now create a Kafka consumer service that listens on the audio input topic:

```java
// src/main/java/com/yourcompany/asr/service/impl/KafkaAsrConsumerService.java
package com.yourcompany.asr.service.impl;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.yourcompany.asr.client.FireRedAsrClient;
import lombok.extern.slf4j.Slf4j;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

import java.util.Map;

@Slf4j
@Service
public class KafkaAsrConsumerService {

    @Autowired
    private FireRedAsrClient asrClient;

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    @Autowired
    private ObjectMapper objectMapper;

    @Value("${asr.kafka.topic.result-output}")
    private String resultTopic;

    /** Listen on the audio input topic and run recognition. */
    @KafkaListener(topics = "${asr.kafka.topic.audio-input}")
    public void consumeAudioMessage(ConsumerRecord<String, String> record) {
        String messageKey = record.key();
        String messageValue = record.value();
        log.info("Audio message received, key: {}, partition: {}", messageKey, record.partition());

        try {
            // Parse the message; the body is assumed to be JSON with audioData and language
            Map<String, String> audioMsg = objectMapper.readValue(messageValue, Map.class);
            String audioBase64 = audioMsg.get("audioData");
            String language = audioMsg.getOrDefault("language", "zh-CN");

            // Run recognition; note recognizeSimple returns null on failure,
            // and Map.of rejects null values, so fail fast here
            String recognizedText = asrClient.recognizeSimple(audioBase64, language);
            if (recognizedText == null) {
                throw new IllegalStateException("recognition returned no text");
            }

            // Build the result message
            Map<String, Object> resultMsg = Map.of(
                    "messageId", messageKey,
                    "status", "SUCCESS",
                    "recognizedText", recognizedText,
                    "timestamp", System.currentTimeMillis()
            );
            String resultJson = objectMapper.writeValueAsString(resultMsg);

            // Publish the result to the result topic
            kafkaTemplate.send(resultTopic, messageKey, resultJson);
            log.info("Recognition done, result published. Key: {}", messageKey);
        } catch (Exception e) {
            log.error("Failed to process Kafka audio message: {}", messageKey, e);
            // Publish a failure-status result
            Map<String, Object> errorMsg = Map.of(
                    "messageId", messageKey,
                    "status", "FAILED",
                    "error", String.valueOf(e.getMessage())
            );
            try {
                kafkaTemplate.send(resultTopic, messageKey, objectMapper.writeValueAsString(errorMsg));
            } catch (Exception ex) {
                log.error("Failed to publish error result", ex);
            }
        }
    }
}
```

You can also expose a producer controller that accepts HTTP requests and publishes them to Kafka:

```java
// src/main/java/com/yourcompany/asr/controller/AsrStreamController.java
package com.yourcompany.asr.controller;

import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.web.bind.annotation.*;

import java.util.Map;
import java.util.UUID;

@Slf4j
@RestController
@RequestMapping("/api/asr/stream")
public class AsrStreamController {

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    @Autowired
    private ObjectMapper objectMapper;

    @Value("${asr.kafka.topic.audio-input}")
    private String audioInputTopic;

    @PostMapping("/submit")
    public String submitToStream(@RequestBody Map<String, Object> request) {
        String messageId = "MSG_" + UUID.randomUUID().toString().substring(0, 8);
        try {
            String messageJson = objectMapper.writeValueAsString(request);
            // Send to the Kafka topic; get() blocks until the send is acknowledged
            kafkaTemplate.send(audioInputTopic, messageId, messageJson).get();
            log.info("Recognition task published to the queue, messageId: {}", messageId);
            return messageId;
        } catch (Exception e) {
            log.error("Failed to publish task to the queue", e);
            return "error: " + e.getMessage();
        }
    }
}
```

With this in place, your service handles streaming workloads, supporting higher concurrency and a more resilient processing pipeline.

## 5. Containerized deployment and configuration

Finally, deployment. Docker is currently the most popular option. First, create a `Dockerfile` in the project root:

```dockerfile
# Official OpenJDK 11 base image
FROM openjdk:11-jre-slim

# Working directory
WORKDIR /app

# Copy the Maven-built jar into the container
# (assuming the jar is named asr-service-0.0.1-SNAPSHOT.jar)
COPY target/asr-service-0.0.1-SNAPSHOT.jar app.jar

# Spring Boot default port
EXPOSE 8080

# JVM options, e.g. heap sizes
ENV JAVA_OPTS="-Xms512m -Xmx1024m -Dspring.profiles.active=prod"

# Start the application
ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar app.jar"]
```

Then a `docker-compose.yml` defines the service and the components it depends on, such as Kafka — especially handy for local development and testing:

```yaml
version: "3.8"
services:
  # Our ASR microservice
  asr-service:
    build: .
    container_name: asr-service
    ports:
      - "8080:8080"
    environment:
      - SPRING_PROFILES_ACTIVE=docker
      - FIRE_RED_ASR_HTTP_ENDPOINT=http://host.docker.internal:8000/v1/recognize  # ASR server on the host
      - SPRING_KAFKA_BOOTSTRAP_SERVERS=kafka:9092
    depends_on:
      - kafka
    networks:
      - asr-network

  # Zookeeper (required by Kafka)
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - "22181:2181"
    networks:
      - asr-network

  # Kafka
  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    networks:
      - asr-network

networks:
  asr-network:
    driver: bridge
```

You also need a matching Docker-profile configuration file, `application-docker.yml`:

```yaml
# application-docker.yml
fire-red:
  asr:
    http-endpoint: ${FIRE_RED_ASR_HTTP_ENDPOINT:http://localhost:8000/v1/recognize}

spring:
  kafka:
    bootstrap-servers: ${SPRING_KAFKA_BOOTSTRAP_SERVERS:localhost:9092}
    consumer:
      group-id: asr-service-docker-group
```

Now a few commands bring the whole stack up:

```bash
# 1. Package the project
mvn clean package -DskipTests

# 2. Build the Docker image
docker-compose build

# 3. Start all services
docker-compose up -d
```

Your speech recognition microservice is now running at http://localhost:8080, connected to Kafka and waiting for tasks.
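On the caller side, the asynchronous task API pairs naturally with polling plus exponential backoff against `/api/asr/task/status/{taskId}`. A minimal sketch in plain Java, with the HTTP status check stubbed out as a `Supplier` so the retry logic stands on its own (class name, attempt counts, and delays are illustrative):

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class TaskPoller {
    /**
     * Poll until the supplier yields a result or attempts run out.
     * In real use, check.get() would GET /api/asr/task/status/{taskId}
     * and return the text once the task status is SUCCESS.
     */
    static Optional<String> poll(Supplier<Optional<String>> check,
                                 int maxAttempts, long initialDelayMs) throws InterruptedException {
        long delay = initialDelayMs;
        for (int i = 0; i < maxAttempts; i++) {
            Optional<String> result = check.get();
            if (result.isPresent()) return result;
            Thread.sleep(delay);
            delay = Math.min(delay * 2, 30_000); // exponential backoff, capped at 30s
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws InterruptedException {
        // Stub: the "task" completes on the third status check
        AtomicInteger calls = new AtomicInteger();
        Supplier<Optional<String>> stub = () ->
                calls.incrementAndGet() >= 3 ? Optional.of("transcript") : Optional.empty();

        Optional<String> result = poll(stub, 5, 10);
        System.out.println(result.orElse("timed out")); // prints "transcript"
    }
}
```

For long-running deployments, a webhook callback or the Kafka result topic from section 4 avoids polling altogether; backoff polling is the simplest fallback when the caller cannot receive pushes.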