1-year internship in the U.S. (NAL)

Who: 2 MS and PhD students in the College of Electrical Engineering and Computer Science, NTU
Internship period: 1 year (starting approximately August 2019)
Salary and benefits:
1. Annual salary: 25,000 USD
2. Flight tickets between Taiwan and the U.S., up to 4,000 USD per student
3. Settling-in allowance: 1,000 USD
4. U.S. health insurance subsidy

•The 2019 research internship program run by the United States Department of Agriculture, Agricultural Research Service (USDA-ARS) in cooperation with our college is now recruiting master's and PhD students. The host institution is the USDA National Agricultural Library (NAL). More information about NAL is available at www.nal.usda.gov.
•Application timeline:
1. Application deadline: Jan 29, 2019
2. NAL announces acceptance: Feb 27, 2019
3. Deadline for students to accept: Mar 2, 2019
4. Visa application: from March 2019
5. Departure for the U.S.: August 2019

•Application documents:
1. 2019 NAL application form
2. Official undergraduate and postgraduate transcripts (GPA)
3. Statement of interest
4. Two reference letters
5. Papers (journal or conference) or samples demonstrating programming skills
6. English Language Certificate
* The English Language Certificate may be provided after NAL acceptance, before the visa application. If NAL considers further improvement in English proficiency necessary, the accepted student must complete the required English courses before departing for the U.S.

Please submit all application documents to Mrs. Dana Yiin in Room 202, Barry Lam Hall, before 17:00 on Jan 29, 2019. The College of EECS will review all applications and nominate candidates to NAL.

Internship Announcement
The National Agricultural Library of the United States Department of Agriculture (NAL; https://www.nal.usda.gov/) is now recruiting students for an internship program in computational bioscience and scientific big data management. Master’s and PhD-level students with experience in bioinformatics and/or software development are invited to participate in projects related to the i5k Workspace@NAL, a USDA database for arthropod genomics (https://i5k.nal.usda.gov; doi: 10.1093/nar/gku983; https://github.com/NAL-i5K). The i5k Workspace is a web resource that provides dissemination, visualization, and curation tools for ‘orphaned’ insect or arthropod genome projects. This is an actively growing project with many possible development opportunities. Projects focus on application/tool development and biocuration services for our research community.
Desired Skills and Related Experience
The exact skills, knowledge, and experience needed will vary with the projects selected. The skills listed below are desired but not required.
  o Basic knowledge of genome assemblies and gene prediction
  o Gene prediction and functional annotation
  o Unix command line (including use of high-performance computing systems)
•Data visualization
  o Google APIs and SDKs (Analytics, Charts, Maps)
  o Other JavaScript libraries (such as jQuery, Flot/jQuery, D3.js, Processing.js, etc.)
  o User experience/user interface
•Programming languages
  o Object-oriented programming (Java, Python)
  o Dynamic scripting languages (Perl, PHP)
  o Statistical programming (R)
•Database design and programming
  o Entity-relation modeling and normalization
  o Performance tuning
  o Data warehouses, business intelligence, and data mining
•Open source software
  o LAMP (Linux/Apache/MySQL/PHP or Python, etc.) software stack
  o Web frameworks (Django, Google Web Toolkit)
  o Middleware (JBoss, Tomcat)
  o Content management (Drupal)
  o Version control (Git, GitHub/GitLab)
  o Continuous integration (Travis, Jenkins CI)
Potential Opportunities
Interns also have the opportunity to develop their own project ideas related to the i5k Workspace.
•Develop workflows for data processing. The i5k Workspace accepts data from many genomes and will host more in the future. In this project, the intern will work with the i5k Workspace team to develop and improve a pipeline in a workflow language (e.g., CWL). This workflow will expedite adding new content to the i5k Workspace’s applications. In the process, the intern will add new content to our database and/or improve existing content.
•Bioinformatics application development. The i5k Workspace is exploring new services for our users. Interns will develop or refine workflows to automatically generate new data types for our genome browsers, such as mapped RNA-Seq reads, methylation data, or lateral gene transfers.
•Functional annotation generation. Interns will run and improve a pipeline for insect functional annotation generation, and integrate this content into the i5k Workspace.
•Develop tools and integrate data to support comparative analyses of arthropod genomic data. Homology data are available for many i5k species. This information can be added to gene pages, visualized in browsers, and represented in many other ways to support comparative analysis.
•Biocuration. The i5k Workspace enables manual curation of gene models by the i5k community. We are seeking an intern interested in manual curation and biocuration to: identify workflows for new annotators; create manual curation tutorials; and interact with the i5k manual curation community to identify curation needs.
•Tripal development. Interns with experience in PHP can contribute to developing modules for the Tripal software (http://tripal.info/).
•Improve i5k Workspace systems, tools and approaches. Interns will review computational approaches in our existing software and develop improved algorithms. For example, interns can review our coordinate conversion workflow for genome assemblies and research improved algorithms to optimize for memory usage, compute efficiency or storage usage.
•Create tests to improve i5k Workspace system functionality and efficiency. Interns will develop or improve uptime, build, and functional tests for the genomics workspace (https://github.com/NAL-i5K/genomics-workspace).
•Improve Standard Operating Procedure documentation to reflect best practices.
Examples of past and current projects:
•Python programs for updating GFF3 coordinates to new assembly versions (https://github.com/NAL-i5K/remap-gff3)
•Reduction of unnecessary/redundant code; implementation of build tests; incorporation of coverage, unit, and functional tests (https://github.com/NAL-i5K/genomics-workspace)
•Development of a novel BLAST user-interface (https://github.com/hotdogee/django-blast)
•Development of a single sign-on system for the i5k Workspace using Django
•Programs to check the quality of the GFF3 format (https://github.com/hotdogee/gff3-py)
•Program to compare two GFF3 files (https://github.com/chienyuehlee/gff-cmp-cat)
•A ‘toolkit’ for the GFF3 annotation format, including programs to QC and merge two GFF3 files for Official Gene Set generation (https://github.com/NAL-i5K/GFF3toolkit/)
•Implementation of HMMER and ClustalW web services (https://github.com/NAL-i5K/genomics-workspace)
•Application stress testing (our internal tests were incorporated into the Apollo codebase)
•A program to convert several file types between assembly coordinate systems: https://github.com/NAL-i5K/coordinates_conversion
Publications of past interns
•Chen, M.-J. M., Lin, H., Chiang, L., Childers, C.P., and Poelchau, M.F. 2019. The GFF3toolkit – QC and Merge Pipeline for Genome Annotation. Methods in Molecular Biology, pp. 75-87; doi: 10.1007/978-1-4939-8775-7_7. Interns Mei-Ju May Chen, Yu-Yu Lin and Li-Mei Chiang are co-authors.
•Poynton, H.C., et al. 2018. The Toxicogenome of Hyalella azteca: A Model for Sediment Ecotoxicology and Evolutionary Toxicology. Environmental Science & Technology 52:10; doi: 10.1021/acs.est.8b00837. Interns Mei-Ju May Chen and Yu-Yu Lin are co-authors.
•Schoville, S., et al. 2018. A model species for agricultural pest genomics: the genome of the Colorado potato beetle, Leptinotarsa decemlineata (Coleoptera: Chrysomelidae). Scientific Reports, 8; doi: 10.1038/s41598-018-20154-1. Intern Mei-Ju May Chen is a co-author.
•Poelchau, M.F., Chen, M.-J. M., Lin, Y.-Y., and Childers, C.P. 2018. Navigating the i5k Workspace@NAL – a resource for arthropod genomes. Methods in Molecular Biology, pp. 557-577; doi: 10.1007/978-1-4939-7737-6_18. Interns Mei-Ju May Chen and Yu-Yu Lin are co-authors.
•Panfilio, K.A., et al. 2017. Molecular evolutionary trends and feeding ecology diversification in the Hemiptera, anchored by the milkweed bug genome. bioRxiv 201731; doi: 10.1101/201731. Interns Mei-Ju May Chen and Chien-Yueh Lee are co-authors.
•Saha, S., et al. (2017). Improved annotation of the insect vector of Citrus greening disease: Biocuration by a diverse genomics community. Database, bax032, https://doi.org/10.1093/database/bax032. Intern Mei-Ju May Chen is a co-author.
•Benoit, J.B., et al. (2016). Unique features of a global human ectoparasite identified through sequencing of the bed bug genome. Nat Commun 7. Interns Chien-Yueh Lee and Han Lin are co-authors.
•McKenna, D.D., et al. (2016). Genome of the Asian longhorned beetle (Anoplophora glabripennis), a globally significant invasive species, reveals key functional and evolutionary innovations at the beetle–plant interface. Genome Biol. 17, 227. Interns Chien-Yueh Lee and Han Lin are co-authors.
•Poelchau, M., et al. (2014). The i5k Workspace@NAL--enabling genomic data access, visualization and curation of arthropod genomes. Nucleic Acids Res. (43), D714-D719. Interns Chien-Yueh Lee, Han Lin and Jun-Wei Lin are co-authors.

