Research Article

A Parallel Algorithm for Automatic Gain Compensation of Ultrasound Images Based on the Fermi Architecture

  • HE Xingwu (何兴无); ZHANG Xia (张霞)
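  • 1. Network & Information Management Center, Chengdu Normal University, Chengdu 611130; 2. Department of Electronics and Information, Chengdu Vocational College of Agricultural Science and Technology, Chengdu 611130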

Received: 2012-05-30

  Revised: 2012-07-05

  Published online: 2012-10-28

A Parallel Algorithm of Automatic Time Gain Compensation for Ultrasound Imaging Based on Fermi Architecture

  • HE Xingwu; ZHANG Xia
  • 1. Network & Information Management Center, Chengdu Normal University, Chengdu 611130, China; 2. Department of Electronics and Information, Chengdu Vocational College of Agricultural Science and Technology, Chengdu 611130, China

Received date: 2012-05-30

  Revised date: 2012-07-05

  Online published: 2012-10-28

Abstract (translated from the Chinese)

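In medical ultrasound imaging systems, ultrasound waves attenuate as they propagate through human tissue, so ultrasound images need effective gain compensation to display well. Most automatic gain compensation algorithms, however, involve a large amount of complex computation, which makes them a major performance bottleneck in clinical real-time imaging systems. This paper therefore proposes a parallel automatic gain compensation algorithm for graphics processing units (GPUs) of the Fermi architecture, a high-performance parallel computing platform. The main processing stages are data pre-processing, region type detection, tissue intensity computation, quadratic surface fitting, and adaptive gain compensation. The core parallel designs are a coarse-grained parallel mean filter, parallel computation of the local variance coefficient, an optimized parallel matrix transpose, and a coarse-grained parallel matrix inversion based on LU decomposition. Test results show that, compared with the CPU-based implementation, the Fermi GPU implementation produces identical, good-quality gain compensation while achieving a large speedup that meets real-time requirements: for 512×261 image data it reaches a frame rate of 427 frames/s, about 267 times faster.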
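The mean filter and local variance coefficient named above can be illustrated with a minimal CPU-side NumPy sketch. It is not the authors' GPU kernel. The function names, the window radius, the replicate padding at the image border, and the definition of the coefficient as local standard deviation divided by local mean are all assumptions made for illustration; the paper's exact definition may differ.

```python
import numpy as np

def box_mean(img, radius):
    """Mean over a (2*radius+1)^2 window; borders use replicate padding (assumption)."""
    k = 2 * radius + 1
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    # Summed-area table with a leading row and column of zeros.
    sat = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    sat[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    # Window sum from four table lookups per output pixel.
    total = (sat[k:k + h, k:k + w] - sat[:h, k:k + w]
             - sat[k:k + h, :w] + sat[:h, :w])
    return total / (k * k)

def local_variation_coefficient(img, radius, eps=1e-12):
    """Local std / local mean (assumed definition of the 'variance coefficient')."""
    img = img.astype(np.float64)
    mean = box_mean(img, radius)
    mean_sq = box_mean(img * img, radius)
    # Clamp tiny negative values caused by floating-point cancellation.
    var = np.maximum(mean_sq - mean * mean, 0.0)
    return np.sqrt(var) / (mean + eps)
```

Every output pixel depends only on four table entries, so the per-pixel work is independent. That independence is the property a GPU implementation could exploit, although how the paper partitions the work across threads is not stated in the abstract.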

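Cite this article

HE Xingwu, ZHANG Xia. A Parallel Algorithm for Automatic Gain Compensation of Ultrasound Images Based on the Fermi Architecture[J]. 科技导报 (Science & Technology Review), 2012, 30(31): 61-65. DOI: 10.3981/j.issn.1000-7857.2012.31.009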

Abstract

Because ultrasound attenuates as it travels through the human body, efficient gain compensation of the ultrasound image is necessary for good image quality in a medical ultrasound imaging system. Traditional manual adjustment has drawbacks, such as the difficulty of adjusting a specific region, so it is important to implement automatic time gain compensation (ATGC) in clinical ultrasound imaging systems. The massive computation that ATGC involves, however, makes it a bottleneck for clinical real-time imaging. This paper presents a new parallel ATGC algorithm for the Fermi graphics processing unit (GPU). Its main stages are pre-processing, speckle detection, tissue intensity computation, 2-D surface fitting, and adaptive gain compensation. The key parallel components are a coarse-grained parallel box filter, parallel computation of the local variance coefficient, an optimized parallel matrix transposition, and a coarse-grained parallel matrix inversion based on LU factorization. Test results show that the GPU output is identical to that of the CPU and that the GPU gives a substantial speedup: 427 frames per second for a 512×261 image, 267 times faster than the CPU implementation.
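The surface-fitting and LU-based inversion steps can be sketched on the CPU as follows. This is a minimal NumPy illustration, not the paper's GPU implementation. It assumes a six-term quadratic basis (1, x, y, x², xy, y²), coordinates normalised to [0, 1], and a least-squares fit through the normal equations; the function names `lu_inverse` and `fit_quadratic_surface` are hypothetical.

```python
import numpy as np

def lu_inverse(a):
    """Invert a square matrix via LU decomposition with partial pivoting."""
    n = a.shape[0]
    lu = a.astype(np.float64).copy()
    perm = np.arange(n)
    for k in range(n):
        p = k + np.argmax(np.abs(lu[k:, k]))      # pivot row
        if lu[p, k] == 0.0:
            raise np.linalg.LinAlgError("matrix is singular")
        if p != k:
            lu[[k, p]] = lu[[p, k]]
            perm[[k, p]] = perm[[p, k]]
        lu[k + 1:, k] /= lu[k, k]                  # multipliers (L part)
        lu[k + 1:, k + 1:] -= np.outer(lu[k + 1:, k], lu[k, k + 1:])
    x = np.eye(n)[perm]                            # right-hand side P @ I
    for i in range(n):                             # forward substitution, unit L
        x[i] -= lu[i, :i] @ x[:i]
    for i in range(n - 1, -1, -1):                 # back substitution, U
        x[i] = (x[i] - lu[i, i + 1:] @ x[i + 1:]) / lu[i, i]
    return x

def fit_quadratic_surface(intensity):
    """Least-squares fit of z = c0 + c1*x + c2*y + c3*x^2 + c4*x*y + c5*y^2."""
    h, w = intensity.shape
    y, x = np.mgrid[0:h, 0:w].astype(np.float64)
    x /= max(w - 1, 1)                             # normalise to [0, 1]
    y /= max(h - 1, 1)
    basis = np.stack([np.ones_like(x), x, y, x * x, x * y, y * y], axis=-1)
    a = basis.reshape(-1, 6)
    b = intensity.reshape(-1).astype(np.float64)
    coeffs = lu_inverse(a.T @ a) @ (a.T @ b)       # normal equations
    return coeffs, (a @ coeffs).reshape(h, w)
```

With this basis the matrix to invert is only 6×6, so a single inversion is cheap. One plausible reading of "coarse-grained" is that many such small systems are solved independently, one per thread or block, but the abstract does not say how the work is divided.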