Duplo

C/C++/Java Duplicate Source Code Block Finder


1 General Information

Duplicated source code blocks can harm the maintainability of software systems.
Duplo is a tool for finding duplicated code blocks in large C/C++/Java systems.

1.1 Sample output

...
src\engine\geometry\simple\TorusGeometry.cpp(56)
src\engine\geometry\simple\SphereGeometry.cpp(54)
    pBuffer[currentIndex*size+3]=(i+1)/(float)subdsU;
    pBuffer[currentIndex*size+4]=j/(float)subdsV;
    currentIndex++;
    pPrimitiveBuffer->unlock();

src\engine\geometry\subds\SubDsGeometry.cpp(37)
src\engine\geometry\SkinnedMeshGeometry.cpp(45)
    pBuffer[i*size+0]=m_ct[0]->m_pColors[i*3];
    pBuffer[i*size+1]=m_ct[0]->m_pColors[i*3+1];
    pBuffer[i*size+2]=m_ct[0]->m_pColors[i*3+2];
...

1.2 Usage

NAME
        Duplo - duplicate source code block finder


SYNOPSIS
        duplo [OPTIONS] [INPUT_FILELIST_FILE] [OUTPUT_FILE]

DESCRIPTION
        Duplo is a tool to find duplicated code blocks in large
        C/C++/Java software systems.

        -ml minimum block size in lines (default is 4)
        -mc minimum number of characters in a line (default is 3);
            lines with fewer characters are ignored
        -ip ignore preprocessor directives
        INPUT_FILELIST_FILE file with list of source files
        OUTPUT_FILE output file

1.3 Feedback and Bug Reporting

1.4 Source files text file generation

Generate a text file listing all source files below the current directory (including subdirectories) with:

Windows
 dir /s /b /a-d *.cpp *.h > files.txt

UNIX
 find . -name "*.cpp" > cppList.txt
 find . -name "*.h" > includeList.txt
 cat cppList.txt includeList.txt > files.txt

and start duplo with: duplo files.txt results.txt
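
If a single script for both platforms is preferred, the same list can be produced with a few lines of Python. This is only a sketch and not part of Duplo: the function names list_source_files and write_file_list are made up for illustration, and the extension match is case-sensitive, unlike the Windows dir command.

```python
import os


def list_source_files(root, extensions=(".cpp", ".h")):
    """Return the sorted paths of all files below root with a matching extension."""
    matches = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(tuple(extensions)):
                matches.append(os.path.join(directory, name))
    return sorted(matches)


def write_file_list(root, output_path, extensions=(".cpp", ".h")):
    """Write one path per line to output_path and return the number of paths."""
    paths = list_source_files(root, extensions)
    with open(output_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return len(paths)
```

Calling write_file_list(".", "files.txt") writes a files.txt that can be passed to duplo as shown above.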

2 Download

Download Duplo here.

3 Performance Measurements

System                   Files   LOC       Time     Hardware
3D Game Engine           275     12211     4 s      3.4 GHz P4
Quake2                   266     102740    58 s     3.4 GHz P4
Computer Game            5639    754320    34 min   3.4 GHz P4
Linux Kernel 2.6.11.10   17034   4184356   16 h     3.4 GHz P4
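
As a rough reading aid, the measurements above can be turned into throughput by dividing lines of code by run time. The sketch below uses only the figures from the table; the times are rounded as published, so the results are approximate. Throughput drops from roughly 3000 lines per second for the smallest system to roughly 70 for the largest, so run time grows faster than code size.

```python
# (system, lines of code, run time in seconds), copied from the table above
measurements = [
    ("3D Game Engine", 12211, 4),
    ("Quake2", 102740, 58),
    ("Computer Game", 754320, 34 * 60),
    ("Linux Kernel 2.6.11.10", 4184356, 16 * 3600),
]


def lines_per_second(loc, seconds):
    """Average number of source lines processed per second."""
    return loc / seconds


throughput = {
    name: lines_per_second(loc, seconds) for name, loc, seconds in measurements
}

for name, rate in throughput.items():
    print(f"{name}: {rate:.0f} lines/s")
```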

4 Background

Duplo detects duplicated code blocks using the techniques described in the paper
A Language Independent Approach for Detecting Duplicated Code
by Stéphane Ducasse, Matthias Rieger and Serge Demeyer.
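
The general idea of such a line-based approach can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not Duplo's actual implementation: lines are normalized by removing whitespace, short lines are skipped (compare the -mc option), and every diagonal of the line-by-line comparison matrix of two files is scanned for runs of equal lines of at least a minimum length (compare the -ml option). The function names and return format are made up for this example.

```python
def normalize(lines, min_chars=3, ignore_preprocessor=False):
    """Return (original line number, text without whitespace) for each kept line."""
    kept = []
    for number, raw in enumerate(lines, start=1):
        text = "".join(raw.split())  # remove all whitespace
        if len(text) < min_chars:
            continue  # too short to be meaningful
        if ignore_preprocessor and text.startswith("#"):
            continue
        kept.append((number, text))
    return kept


def find_duplicate_blocks(lines_a, lines_b, min_block=4, min_chars=3):
    """Return (start line in a, start line in b, length) for each duplicated block.

    The length counts kept (normalized) lines, not raw source lines.
    """
    a = normalize(lines_a, min_chars)
    b = normalize(lines_b, min_chars)
    blocks = []
    # Each offset selects one diagonal of the len(a) x len(b) comparison matrix.
    for offset in range(-(len(a) - 1), len(b)):
        i, j = max(0, -offset), max(0, offset)
        run = 0  # length of the current run of equal lines
        while i < len(a) and j < len(b):
            if a[i][1] == b[j][1]:
                run += 1
            else:
                if run >= min_block:
                    blocks.append((a[i - run][0], b[j - run][0], run))
                run = 0
            i += 1
            j += 1
        if run >= min_block:  # run that reaches the end of the diagonal
            blocks.append((a[i - run][0], b[j - run][0], run))
    return blocks
```

Comparing every line of one file with every line of the other is quadratic in the number of lines, which is why this sketch is only suitable for small inputs.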

5 License

Duplo is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

Duplo is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with Duplo; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
