Reproducibility Package

Scientific work only has value if it can be understood, verified, and reproduced. Your Reproducibility Package documents how you achieved your results and provides everything needed for another researcher to repeat your main findings.

This deliverable is graded jointly under two criteria: "Results, Analysis & Reflection" and "Reproducibility".


Purpose

The goal of the reproducibility package is not just to archive your files — it is to make your work transparent, traceable, and reusable. Think of it as your scientific fingerprint: a minimal, well-documented bundle that anyone with the right background could rerun.

Some of you will be continuing or building upon projects from previous student groups. If you find that parts of older projects are difficult to understand or reproduce, that frustration is exactly what we aim to eliminate. Your task is to ensure that future students can build on your work without ambiguity, lost data, or missing context. Good documentation is not only a courtesy — it is what turns your project into a lasting contribution to the CACoM ecosystem.


Required Contents

Your submission must include all materials needed to interpret or reproduce your project. Exact contents vary by project type, but the structure below applies to most teams.

| Component | Description |
| --- | --- |
| Report | A short (3–6 pages) write-up summarizing motivation, methods, results, and conclusions. It may extend the poster or serve as a compact paper. |
| Code | All scripts, notebooks, or programs required to generate your key figures and results. Clearly labeled and runnable. |
| Data / data description | Either small data files or detailed instructions on how to obtain them (e.g., dataset name, link, preprocessing steps). |
| Figures & diagrams | Core plots, flowcharts, or architecture diagrams that illustrate your analysis or experimental setup. |
| README.md | A concise guide describing what's inside the package and how to reproduce your key results. |
| License / credits | Indicate sources of data and code; include any licenses where required. |

tip

If you used a private or sensitive dataset, describe the data access procedure and provide a small synthetic example to demonstrate your workflow.
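
A minimal sketch of such a synthetic example, assuming a simple two-column table can stand in for your real data. The column names (`sample`, `value`), the default seed, and both function names are hypothetical placeholders, not a prescribed format; shape the mock data so it matches whatever your own pipeline expects.

```python
import csv
import io
import random


def make_synthetic_records(n_rows, seed=0):
    """Return n_rows fake records; no value is derived from real subjects."""
    rng = random.Random(seed)  # local generator, so the output is repeatable
    return [
        {"sample": i, "value": round(rng.gauss(0.0, 1.0), 3)}
        for i in range(n_rows)
    ]


def records_to_csv_text(records):
    """Serialize the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0]), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

To ship the mock file with your package, write the returned text to a relative location of your choice, for example `Path("data/synthetic_example.csv").write_text(records_to_csv_text(make_synthetic_records(100)))`, and mention in the README that it is synthetic.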

note

Exact structure may vary — for instance, survey or review-based projects may include their questionnaire, PRISMA flow diagram, or coding scheme instead of raw data files. The key is transparency and reproducibility within your specific project type.


Data Upload Policy

It is essential to distinguish between what may be uploaded publicly and what must remain private.

GitHub (optional)

  What to upload:
  • Code, documentation, and synthetic or open datasets.
  • Small mock examples that illustrate the workflow.
  • Figures or plots generated from anonymized data.

  What not to upload:
  ❌ Real clinical, patient, or proprietary datasets.
  ❌ Any identifiable or licensed material that you do not own the rights to share.
  ❌ Raw recordings from CTG, IMU, or similar sensors.

Google Drive (official CACoM submission folder)

  What to upload:
  • All materials required for full internal reproducibility, including the actual datasets used (provided licenses and ethics allow internal academic sharing).

  What not to upload:
  ❌ Do not share Google Drive links publicly. This upload is strictly for course instructors and examiners.

🚨 Important

Never upload real patient or proprietary data to GitHub or any public service. If you do so, you may violate data protection laws and TUM regulations. Upload such data only to the private, course-managed Google Drive folder designated for your submission.

note

All materials must be submitted to the Google Drive folder assigned to your group number, and confirmed by email to Prof. Martin Daumer (CC Pooja N. Annaiah). The folder link will be shared after topic approval.


Good Practices

  • Use relative paths and fixed random seeds where applicable.
  • Include environment files (environment.yml, requirements.txt, or Project.toml with Manifest.toml).
  • Comment your code and label outputs with figure numbers that match your report/poster.
  • Export figures directly from your code as image files (PNG, PDF, SVG) rather than taking screenshots.
  • Prefer open formats over proprietary ones (CSV > Excel, PNG > PowerPoint).
  • If using Jupyter/Pluto notebooks, ensure they run from top to bottom without manual intervention.
  • Provide clear version notes or changelogs if building upon previous groups' work.
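
The first practice in the list above (relative paths and fixed seeds) can be sketched as follows. This is an illustration under stated assumptions, not a required layout: `SEED`, `project_file`, and `seeded_sample` are hypothetical names, and the seed value is arbitrary.

```python
import random
from pathlib import Path

SEED = 42  # arbitrary illustrative value; record whichever seed you use


def project_file(root, *parts):
    """Join path parts onto the project root and refuse absolute parts."""
    for part in parts:
        if Path(part).is_absolute():
            raise ValueError(f"absolute path not allowed: {part}")
    return Path(root).joinpath(*parts)


def seeded_sample(values, k, seed=SEED):
    """Draw k items reproducibly by using a generator with a fixed seed."""
    rng = random.Random(seed)
    return rng.sample(list(values), k)
```

In a script, `root` would typically be `Path(__file__).resolve().parent`, so the package keeps working wherever it is unpacked. Note that the standard-library generator is separate from those of numerical libraries; if you use one of those, it needs its own seed.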

Verification and Testing

Before submission:

  1. Delete large intermediate data files and re-run your entire pipeline from raw inputs.
  2. Verify that every figure and table in your report can be regenerated.
  3. Have one team member unfamiliar with the code follow the README instructions — if they succeed, your package is reproducible.
  4. Be prepared to walk through your entire pipeline and explain how results were produced during your poster session discussion.
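
Step 2 of the list above can be partly automated. The sketch below assumes your pipeline writes its figures and tables into a single output folder and that you keep a list of the file names your report refers to; `missing_outputs` and the file names in the demonstration are hypothetical.

```python
import tempfile
from pathlib import Path


def missing_outputs(output_dir, expected_names):
    """Return the expected files that are absent or empty in output_dir."""
    output_dir = Path(output_dir)
    missing = []
    for name in expected_names:
        candidate = output_dir / name
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Demonstration in a throwaway folder; in practice, point this at the
    # folder your pipeline writes to after a clean re-run.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "fig1.png").write_bytes(b"placeholder")
        print(missing_outputs(tmp, ["fig1.png", "fig2.png"]))  # ['fig2.png']
```

An empty result only shows that the files were produced; it does not confirm that their content matches the report, so a visual comparison is still needed.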
note

You are not expected to produce a production-level software package.
Clarity, completeness, and honesty matter more than perfection.


Ethics and Data Sharing

  • Do not upload identifiable or patient data to public repositories.
  • Respect dataset licenses and include attributions.
  • If your project involved external collaborators or clinical data, ensure you have written permission to share materials.
  • For CTG or IMU data collected under institutional agreements, all sharing must remain internal.

Submission Format

| Item | Format | Notes |
| --- | --- | --- |
| Main package | .zip or .tar.gz | Upload to your group’s Google Drive folder |
| Notification | Email to Prof. Martin Daumer, CC Pooja N. Annaiah | Include group number and project title in the subject line |
| GitHub repository (optional) | Public or private link | Must contain only non-sensitive code and documentation |
| File size | | Keep your submission as compact as reasonable: remove unnecessary intermediates and compress large files. If your data are genuinely large (e.g. multi-GB recordings), briefly note this in your README and coordinate with instructors. |
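
One possible way to build the main package as a .zip while leaving out regenerable clutter is sketched below. The excluded folder names are an assumption about what a typical project treats as intermediates, and `build_zip` is a hypothetical helper; adjust both to your own layout.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical exclusions; adjust to whatever your project treats as
# regenerable intermediates.
EXCLUDED_DIRS = {"__pycache__", ".git", ".ipynb_checkpoints"}


def build_zip(source_dir, archive_path, excluded_dirs=EXCLUDED_DIRS):
    """Zip source_dir into archive_path and return the stored file names."""
    source_dir = Path(source_dir)
    archive_path = Path(archive_path)
    added = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            relative = path.relative_to(source_dir)
            if not path.is_file() or set(excluded_dirs) & set(relative.parts):
                continue  # skip folders themselves and anything excluded
            if path.resolve() == archive_path.resolve():
                continue  # never pack the archive into itself
            archive.write(path, relative.as_posix())
            added.append(relative.as_posix())
    return added
```

Reviewing the returned list before uploading is a cheap way to confirm that the report, README, and environment files are actually inside the archive.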

Common Pitfalls

caution
  • Uploading real clinical or proprietary data to public GitHub
  • Missing README or unclear instructions
  • Using absolute file paths that break on other systems
  • Omitting data descriptions or environment files
  • Submitting code that does not run cleanly
  • Forgetting to include the report or figures referenced in your poster
  • Uploading only code without explanation or context
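
The absolute-path pitfall in the list above can be caught with a rough scan before submission. This sketch is a heuristic under stated assumptions: the pattern only looks for quoted strings that begin like a common Unix location or a Windows drive path, the file suffixes are illustrative, and `find_absolute_paths` is a hypothetical helper. It will miss some cases and may flag harmless ones.

```python
import re
import tempfile
from pathlib import Path

# Heuristic pattern (an assumption, not an exhaustive rule): a quote followed
# by a typical Unix root folder or by a Windows drive letter.
ABSOLUTE_PATH = re.compile(r"""["'](?:/(?:home|Users|mnt|tmp)/|[A-Za-z]:[\\/])""")


def find_absolute_paths(root, suffixes=(".py", ".jl", ".R")):
    """Return (relative file name, line number) pairs that look suspicious."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if ABSOLUTE_PATH.search(line):
                hits.append((path.relative_to(root).as_posix(), number))
    return hits
```

Each reported line is a candidate for replacement with a path built relative to the project folder.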

Quick Checklist

  • All code runs and regenerates the key figures and results
  • Data sources clearly documented or included in Google Drive submission
  • README explains exactly how to reproduce results
  • No sensitive or proprietary data uploaded to public GitHub
  • Package runs on a clean environment
  • Uploaded to Google Drive before the final deadline