Transcriptome analysis of ageing in uninjured human Achilles tendon

Introduction: The risk of tendon injury and disease increases significantly with increasing age. The aim of the study was to characterise transcriptional changes in human Achilles tendon during the ageing process in order to identify molecular signatures that might contribute to age-related degeneration.
Methods: RNA for gene expression analysis using RNA-Seq and quantitative real-time polymerase chain reaction analysis was isolated from young and old macroscopically normal human Achilles tendon. RNA sequence libraries were prepared following ribosomal RNA depletion, and sequencing was undertaken by using the Illumina HiSeq 2000 platform. Expression levels among genes were compared by using fragments per kilobase of exon per million fragments mapped. Differentially expressed genes were defined by using the Benjamini-Hochberg false discovery rate approach (P < 0.05, expression ratios ±1.4 log2 fold change). Alternative splicing of exon variants was also examined by using Cufflinks. The functional significance of genes that showed differential expression between young and old tendon was determined by using Ingenuity Pathway Analysis.
Results: In total, the expression of 325 transcribed elements, including protein-coding transcripts and non-coding transcripts (small non-coding RNAs, pseudogenes, long non-coding RNAs and a single microRNA), was significantly different in old compared with young tendon (±1.4 log2 fold change, P < 0.05). Of these, 191 were at higher levels in older tendon and 134 were at lower levels in older tendon. The top networks for genes differentially expressed with tendon age were from cellular function, cellular growth, and cellular cycling pathways. Notable differential transcriptome changes were also observed in alternative splicing patterns. Several of the top gene ontology terms identified in downregulated isoforms in old tendon related to collagen and post-translational modification of collagen.
Conclusions: This study demonstrates dynamic alterations in RNA with age at numerous genomic levels, indicating changes in the regulation of transcriptional networks. The results suggest that ageing is not primarily associated with loss of ability to synthesise matrix proteins and matrix-degrading enzymes. In addition, we have identified non-coding RNA genes and differentially expressed transcript isoforms of known matrix components with ageing which require further investigation.
Electronic supplementary material: The online version of this article (doi:10.1186/s13075-015-0544-2) contains supplementary material, which is available to authorized users.


Introduction
The increasing number of people reaching old age presents huge challenges to society because, although life span is increasing, the quality of life experienced by many individuals in old age is poor [1]. Although age-related disease of muscle, bone, and joint is well recognised, the fibrous connecting tendon tissue has received little attention, despite representing a very common site of pain and dysfunction. Epidemiological studies have revealed a clear link between age and increasing incidence of tendon injury [2,3], suggesting that the mechanical integrity of tendon declines during the ageing process.
Although it is generally accepted that a degenerative process precedes gross tendon injury, the aetiology of this process remains elusive and degeneration itself is poorly defined. Histological examination of painful Achilles tendon [4], dysfunctional posterior tibialis [5], and supraspinatus tendon collected from cadavers [6] has revealed pathological changes, including signs of collagen fibre disruption, increased staining for glycosaminoglycan, hypercellularity, and cell shape change to a more chondroid appearance. Similar changes have been observed in macroscopically abnormal equine flexor tendon [7], another common site of age-associated tendon injury. Histological abnormalities are more often observed in older individuals [6], although the relationship between ageing and the apparent change in cell function is not clear.
Ageing is generally associated with a decline in protein synthesis [8] and a loss of cell functionality [9]. It has been suggested that early degenerative changes in tendon result from an accumulation of micro-damage within the extracellular matrix (ECM) due to an imbalance between anabolic and catabolic pathways [10]. Recent work on equine flexor tendon identified an accumulation of partially degraded collagen within the ECM of old tendons, and it was hypothesised that an inability to remove partially degraded collagen may account for reduced mechanical competency [11]. Another study found that flexor tendon explants from older horses were more susceptible to fatigue damage following cyclical loading in vitro than explants from young horses and that this was a cell-mediated process involving the matrix metalloproteinases (MMPs) [12].
Cell ageing has been associated with a decreased ability to modulate inflammation resulting in a chronic low-level inflammation termed 'inflamm-aging' [13]. Recent work by Dakin and colleagues [14] measured prostaglandin E2 in injured equine flexor tendons and found that levels increased with increasing horse age but that levels of formyl peptide receptor 2/ALX, a receptor responsible for suppressing the inflammatory response, were significantly reduced. These findings intimate that aged individuals exhibit a reduced capacity to resolve inflammation and that ageing may contribute to deregulated tendon repair through these pathways.
Quantitative analysis of gene expression changes with age may help the understanding of ageing mechanisms and their interactions with age-related diseases such as tendinopathies [15]. Although microarray technology has been employed to investigate gene expression changes following tendon injury [16], in tendinopathic tissue [17], in response to cyclical strain [18] or a single loading event [19], and the effect of loading on tendon healing [20], no comprehensive analysis of alterations in gene expression with age has been undertaken in tendon.
RNA-Seq can capture the whole transcriptome, including coding RNAs, isoforms produced by alternative splicing, long non-coding RNAs (lncRNAs) (the importance of which is becoming apparent in disease [21] and ageing [22][23][24]), and short non-coding RNAs. We have previously used RNA-Seq successfully on equine cartilage tissue and identified an over-representation of genes with reduced expression relating to ECM, degradative proteases, matrix synthetic enzymes, cytokines, and growth factors in ageing cartilage [24].
In this study, we used RNA-Seq to comprehensively identify the human Achilles tendon transcriptome for the first time and then examine changes that occur with ageing. We hypothesised that ageing results in reduced expression of ECM-related proteins and matrix-degrading enzymes. In addition, we sought to identify previously unrecognised splice variants and non-coding RNAs associated with tendon ageing in a 'bottom-up' inductive approach.

Sample collection and preparation
All human Achilles tendons used in this study, for both RNA-Seq and quantitative real-time polymerase chain reaction (qRT-PCR), were harvested from limbs amputated during surgical procedures to treat sarcomas at the Royal National Orthopaedic Hospital, Stanmore. Tissue collection was carried out through the Stanmore Musculoskeletal Bio-Bank, which has ethical approval from the Cambridgeshire 1 Research Ethics Committee (REC reference 09/H0304/78) to collect tissue for research into musculoskeletal conditions. All patients gave consent for their tissue to be used for musculoskeletal research. Local research and development approval for this project was given by the UCL/UCLH/RF Joint Research Office (reference number 11/0464). For RNA-Seq, tendons were collected from donors who were 69.4 ± 7.3 years old (old group, n = 5, 3 female, 2 male) and donors who were 19 ± 5.8 years old (young group, n = 4, 4 male). Tendon tissue was collected within 24 hours of limb removal, except for one sample in which tissue was collected within 48 hours. The Achilles tendon was dissected free from the limb. Only tendons with a normal macroscopic appearance were used for this study. A section of tissue approximately 1 cm in length was taken from the mid region of the tendon between the musculotendinous junction and the insertion site. Outer tissue (paratenon) was removed and the remaining tendon tissue was placed into RNAlater (Ambion, Warrington, UK) in accordance with the instructions of the manufacturer.

RNA extraction
Tendon was pulverised into a powder with a dismembrator (Mikro-S; Sartorius, Melsungen, Germany) following freezing in liquid nitrogen. Immediately, 20 volumes of Tri Reagent (Ambion) were added to the powdered tendon tissue, and the RNA was extracted and purified as described by Peffers et al. [25]. RNA was quantified by using a Nanodrop ND-100 spectrophotometer (Labtech, Uckfield, East Sussex, UK) and assessed for purity by UV absorbance measurements at 260 and 280 nm.

RNA-Seq analysis: cDNA library preparation and sequencing
Total RNA was analysed by the Centre for Genomic Research, University of Liverpool, for RNA-Seq library preparation and sequencing by using the Illumina HiSeq 2000 platform. Total RNA integrity was confirmed by using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Ribosomal RNA (rRNA) was depleted from 9 total RNA samples by using the Ribo-Zero™ rRNA Removal Kit (Human/Mouse/Rat; Epicentre, Madison, WI, USA) in accordance with the instructions of the manufacturer. cDNA libraries were prepared with the ScriptSeq v2 RNA-Seq library preparation kit (Epicentre) by using 50 ng rRNA depleted RNA as starting material in accordance with manufacturer protocols as previously described [24]. The final pooled library was diluted to 8 pmol before hybridisation. The dilute library (120 μL) was hybridised on one lane of the HiSeq 2000 at 2 × 100-base pair (bp) paired-end sequencing with v3 chemistry.

Data processing
The sequence libraries for each sample were processed by using CASAVA version 1.8.2 to produce 100-bp paired-end sequence data in fastq format. The fastq files were processed by using Cutadapt version 1.2.1 [26] with option '-O 3' to trim adapter from any read if it matched the adapter sequence for 3 bp or more at the 3′ end. In addition, a quality trimming was performed by using Sickle version 1.200.
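The minimum-overlap rule for 3′ adapter trimming described above can be illustrated with a minimal Python sketch. This is not Cutadapt's implementation: it uses exact matching only (Cutadapt also tolerates mismatches), and the function name and the example adapter sequence are hypothetical, chosen purely for illustration.

```python
def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove an adapter, or a prefix of it, from the 3' end of a read.

    Simplified illustration: exact matches only, no error tolerance.
    """
    # Case 1: the whole adapter occurs in the read; cut at its first occurrence.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with a prefix of the adapter that is at least
    # min_overlap bases long; try the longest candidate prefix first.
    longest = min(len(read), len(adapter) - 1)
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No sufficiently long match: leave the read untouched.
    return read


# Hypothetical adapter sequence used only for this example.
EXAMPLE_ADAPTER = "AGATCGGAAGAGC"
trimmed = trim_3prime_adapter("ACGTACGTAGA", EXAMPLE_ADAPTER)  # ends with "AGA"
untouched = trim_3prime_adapter("ACGTACGTAG", EXAMPLE_ADAPTER)  # only "AG" matches
```

A short overlap threshold such as 3 bp removes more adapter remnants at the cost of occasionally clipping genuine bases that happen to match the start of the adapter.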
The trimmed R1-R2 read pairs, for each sample, were aligned to the reference sequence [27] by using TopHat2 version 2.0.10 [28] with default settings, except for the option -g 1. Read counts were obtained from the mapping results by using HTSeq-count and genome annotation [29].
The differential gene expression analysis was performed on the R platform by using the edgeR package [30] and focused on the contrast of old and young donors. The count data were normalised across libraries by using the trimmed mean of M values (TMM), the default method in edgeR. The tagwise dispersions were estimated and then used for estimating and testing logFC (log2 fold change). Differentially expressed genes (DEGs) were extracted by applying a false discovery rate (FDR) threshold of less than 0.05 to adjusted P values, which were generated by using the Benjamini and Hochberg approach [31]. In addition, FPKM (fragments per kilobase of exon per million fragments mapped) values were converted from count values for comparing expression levels among genes. All sequence data produced in this study have been submitted to National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO) under Array Express accession number E-MTAB-2449.
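As an illustration of two of the calculations named above, the following Python sketch computes an FPKM value from a fragment count and applies the Benjamini and Hochberg adjustment to a list of P values. It is a minimal sketch under stated assumptions, not the edgeR code used in the study; the function names and input numbers are hypothetical.

```python
def fpkm(fragment_count: float, exon_length_bp: float,
         total_mapped_fragments: float) -> float:
    """Fragments per kilobase of exon per million fragments mapped."""
    # Dividing by (length / 1e3) and (total / 1e6) is a factor of 1e9 overall.
    return fragment_count * 1e9 / (exon_length_bp * total_mapped_fragments)


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the adjusted
    # values stay monotone (each is the minimum of itself and those above).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical inputs for illustration only.
example_fpkm = fpkm(500, 2000, 10_000_000)
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in example_q]
```

Genes whose adjusted value falls below 0.05 would then be retained as DEG candidates, subject to any additional fold-change cut-off.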

Analysis of splice variants
Trimmed paired reads were aligned to a reference human transcriptome (Ensembl iGenomes build GRCh37) by using Bowtie2 [32]. The alignments (BAM files) were converted into sorted SAM files by using SAMtools [33]. Parameters for TopHat were estimated by using a Picard tool (CollectInsertSizeMetrics.jar) [34]. Reads were aligned to the reference genome (Ensembl build GRCh37) by using TopHat [35], specifying mate inner distance (mean inner distance between mate pairs) and standard deviation for each sample. Mapped reads were then assembled into complete transcripts by using the splice junction mapping tool Cufflinks [36] with option -G, which uses the Ensembl reference gene track to improve mapping. Cuffmerge was used to merge the assembled transcripts into a consensus gene track from all of the mapped samples. Cuffdiff was used to identify DEGs and differentially expressed transcripts between young and old tendon. Genes and transcripts were identified as being significantly differentially expressed with q values of less than 0.05, calculated by the Benjamini and Hochberg FDR correction [31].
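The relationship between an estimated insert size and the mate inner distance can be sketched as follows. This is a minimal illustration that assumes the insert size spans the whole fragment, including both reads, and that both mates have the same length; after trimming, read lengths vary, so a real calculation would use the observed lengths. The function name and the example numbers are hypothetical.

```python
def mate_inner_distance(mean_insert_size: float, read_length: int) -> float:
    """Distance between the inner ends of two mates of a read pair.

    Assumes mean_insert_size covers the full fragment (both reads included)
    and that both mates are read_length bases long.
    """
    return mean_insert_size - 2 * read_length


# Hypothetical example: a 320-bp mean fragment sequenced as 2 x 100 bp.
inner = mate_inner_distance(320.0, 100)
```

A negative result simply means that the two mates overlap within the fragment.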
Downstream analysis and visualisation of results, including quality control of the samples, was undertaken by using the cummeRbund package in R. Graphs were generated by using cummeRbund and the ggplot2 package [37].

Functional analysis
To systematically determine the networks, functional analyses, and canonical pathways in which the DEGs might be involved, we performed pathway/network enrichment analysis using the Ingenuity Pathway Analysis (IPA) tool from Ingenuity Systems [38] with a list of DEGs with adjusted P values of less than 0.05 and ±1.4 log2 fold regulation. Gene symbols were used as identifiers and the Ingenuity Knowledge Base gene was used as a reference for a pathway analysis. For network generation, a data set containing gene identifiers and corresponding expression values was uploaded. Default settings were used to identify molecules whose expression was significantly differentially regulated. These molecules were overlaid onto a global molecular network contained in the Ingenuity Knowledge Base. Networks of 'network-eligible molecules' were then algorithmically generated based on their connectivity. The functional analysis identified the biological functions and diseases that were most significant to the data set. Right-tailed Fisher's exact test was used to calculate P values. Canonical pathways analysis identified the pathways from the IPA library that were most significant to the data set.
For isoform analysis, the Database for Annotation, Visualization and Integrated Discovery (DAVID) (DAVID bioinformatics resources 6.7) was used [39]. The web-based functional annotation tool enabled functional clustering of genes. The functional clustering tool was used for functional enrichment for DEG isoforms with adjusted P values of less than 0.05 and ±1.4 log2 fold regulation.
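A right-tailed Fisher's exact test for enrichment, as named above, is the upper tail of a hypergeometric distribution. The Python sketch below shows the calculation under simplifying assumptions; it is not IPA's implementation, and the function name, argument names, and example counts are hypothetical.

```python
from math import comb


def fisher_right_tailed(overlap: int, list_size: int,
                        pathway_size: int, background_size: int) -> float:
    """P(X >= overlap) for X ~ Hypergeometric(background, pathway, list).

    overlap:         genes in the input list that belong to the pathway
    list_size:       number of genes in the input list
    pathway_size:    number of background genes annotated to the pathway
    background_size: number of genes in the reference set
    """
    total = comb(background_size, list_size)
    upper = min(list_size, pathway_size)
    tail = sum(
        comb(pathway_size, i)
        * comb(background_size - pathway_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 2 of 3 listed genes fall in a 4-gene pathway,
# drawn from a background of 10 genes.
p_enrichment = fisher_right_tailed(2, 3, 4, 10)
```

A small P value indicates that the overlap between the gene list and the pathway is larger than expected by chance alone.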

Real-time polymerase chain reaction
Samples of RNA from both the same pools used for the RNA-Seq analysis and an additional independent cohort harvested in the same manner (n = 4 young; 16.7 ± 2.8 years old and n = 4 old; 73.2 ± 6.5 years old) were used for qRT-PCR. To validate results from differentially expressed isoforms, the independent cohort was used. Moloney murine leukaemia virus (M-MLV) reverse transcriptase and random hexamer oligonucleotides (both from Promega, Southampton, UK) were used to synthesize cDNA from 1 μg RNA in a 25 μL reaction. PCR was performed on 1 μL 10× diluted cDNA by employing a final concentration of 300 nM of each primer in 20 μL reaction volumes on an ABI 7700 Sequence Detector using PrimerDesign 2X Precision™ SYBR Green Mastermix (Primer Design, Southampton, UK). qRT-PCR was undertaken by using gene-specific primers (for protein-coding genes these were exon-spanning). Primers used had been validated in previous publications [40,41] and were supplied by Eurogentec (Seraing, Belgium) or were designed and validated commercially (Primer Design). Steady-state transcript abundance of potential endogenous control genes was measured in the RNA-Seq data. Assays for four genes, glucose-6-phosphate isomerase (GPI), beta-actin (ACTB), ribosomal protein 13 (RPS13), and ribosomal protein 16 (RPS16), were selected as potential reference genes as their expression was unaltered. Stability of this panel of genes was assessed by applying a gene stability algorithm [42]. RPS16 was selected as the most stable endogenous control gene. Relative expression levels were normalised to RPS16 and calculated by using the 2^(−ΔCt) method [43]. Primer pairs used in this study are listed in Table 1. qRT-PCR analysis data were log10-transformed to ensure normal distribution and then analysed by using Student's t test.
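The normalisation and testing steps described above can be sketched in Python. This is a minimal illustration, not the analysis code used in the study: the function names and the Ct values are hypothetical, and only the pooled-variance t statistic is computed, because the standard library has no t distribution from which to derive a P value.

```python
import math
import statistics


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression relative to the reference gene by the 2^(-dCt) method."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def student_t_statistic(group_a: list[float], group_b: list[float]) -> float:
    """Two-sample Student's t statistic with pooled (equal) variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error


# Hypothetical (target Ct, reference Ct) pairs for two groups of donors.
young = [math.log10(relative_expression(t, r))
         for t, r in [(25.0, 20.0), (26.0, 20.5), (24.5, 19.8)]]
old = [math.log10(relative_expression(t, r))
       for t, r in [(23.0, 20.1), (22.5, 19.9), (23.4, 20.3)]]
t_value = student_t_statistic(old, young)
```

Because a lower Ct means more template, a target that needs fewer cycles relative to the reference gene yields a larger relative expression value.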

Statistical analysis
The analyses were undertaken by using edgeR [30]

Overview of RNA-Seq data
An average of 32.1 million pairs of 100-bp paired-end reads per sample were generated that aligned to the reference sequence of the human genome. Across the pooled R1 and R2 files for all samples, 95.1% of called bases in the trimmed data had a Phred score of more than 30 [44]. (See Table 2 for a summary of mapping results.) Of the 63,152 human genes, between 40.5% and 47.4% had at least one read aligned; 20,322 of the genes had no reads aligned from any of the nine samples. This is similar to the output of other RNA-Seq studies [24,45].
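For readers unfamiliar with Phred scores, the sketch below converts a score into a base-calling error probability and computes the fraction of bases in a quality string that exceed a threshold. It assumes Phred+33 ASCII encoding of the fastq quality line, which is an assumption about the files rather than something stated in the text; the function names and the example quality string are hypothetical.

```python
def phred_to_error_probability(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def decode_phred33(quality_line: str) -> list[int]:
    """Decode a fastq quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_line]


def fraction_above(quality_line: str, threshold: int = 30) -> float:
    """Fraction of bases whose Phred score is strictly above the threshold."""
    scores = decode_phred33(quality_line)
    return sum(q > threshold for q in scores) / len(scores)


# A Phred score of 30 corresponds to an error probability of 1 in 1,000.
p_q30 = phred_to_error_probability(30)
# Hypothetical quality string: four Q40, two Q30, one Q20, and one Q10 base.
share = fraction_above("IIII??5+")
```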
These reads were used to estimate transcript expression of all nine samples using FPKM in order to identify the most abundant genes in tendon. Table 3 demonstrates the 25 most highly expressed genes in young and old tendon (the entire data set is in Additional file 1).

Identification of differentially expressed genes and isoforms
A principal component analysis (PCA) plot of log2 gene expression data indicated that the effect of age on gene expression was distinct, as the data clustered in two groups (Figure 1A). Within the young group, two samples clustered together and two were independent of each other, indicating more variability between young donors. Alterations in gene expression between young and old tendon demonstrated significant age-related changes. In total, the expression of 325 transcribed elements, including protein-coding transcripts and non-coding transcripts, small nucleolar RNAs (snoRNAs), pseudogenes, lncRNAs, and a single microRNA, was significantly different in old compared with young tendon (±1.4 log2 fold change, FDR-adjusted P value of less than 0.05) (Figure 1B). Of these, 191 were at higher levels in the older tendon and 134 were at lower levels in the older tendon. The 10 genes with the greatest increase and the 10 with the greatest decrease in expression during tendon ageing are given in Table 4. The entire list of significantly differentially expressed transcripts is presented in Additional file 2. NCBI GEO under accession number E-MTAB-2449 contains a complete list of all genes mapped. Of the 191 transcripts expressed at a higher level in old donors, 148 were known protein-coding genes. The remaining 43 genes comprised 34 lncRNAs, one snoRNA, and eight pseudogenes (Table 5). Within the group where gene expression was lower in old compared with young tendon, 112 were known protein-coding genes. The remaining 22 genes comprised 16 lncRNAs, one snoRNA, four pseudogenes, and a single microRNA (miRNA) (Table 6). Thus, 325 genes were input into IPA for downstream analysis, and 273 of these were mapped.
The analysis identified a number of transcript isoforms expressed in tendon, some of which were differentially expressed between young and old groups of tendon (Figure 2). In total, 183,660 isoforms were detected in young and 191,673 isoforms were detected in old tendon. Among these, 21,193 isoforms were detected only in young and 29,206 isoforms only in old. Sixty-three known isoforms were upregulated in old tendon, with 80 downregulated with an FDR-adjusted P value of less than 0.05 and ±1.4 log2 fold regulation. The top 10 up- and down-regulated isoforms are presented in Table 7. The entire list of significantly differentially expressed isoforms is presented in Additional file 3.
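To make the ±1.4 log2 fold change cut-off concrete, the sketch below converts log2 values to linear fold changes and applies the two thresholds quoted above. It is a minimal illustration with hypothetical function names; whether the original analysis treated the boundaries as inclusive or exclusive is not stated, so the inclusive fold-change comparison used here is an assumption.

```python
def log2_to_linear(log2_fold_change: float) -> float:
    """Convert a log2 fold change to a linear expression ratio."""
    return 2.0 ** log2_fold_change


def passes_thresholds(log2_fold_change: float, adjusted_p: float,
                      lfc_cutoff: float = 1.4, fdr_cutoff: float = 0.05) -> bool:
    """True if a gene meets both the fold-change and the FDR criteria.

    Assumption: the fold-change boundary is inclusive, the FDR one exclusive.
    """
    return abs(log2_fold_change) >= lfc_cutoff and adjusted_p < fdr_cutoff


# A log2 fold change of 1.4 is roughly a 2.6-fold linear increase,
# and -1.4 is roughly a reduction to 0.38 of the comparison level.
up = log2_to_linear(1.4)
down = log2_to_linear(-1.4)
```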

Differentially expressed genes and network analysis
DEGs (325) and differentially expressed transcript isoforms (143) associated with ageing were analysed together in IPA with the following criteria: P value of less than 0.05 and ±1.4 log2 fold change. Network-eligible molecules were overlaid onto molecular networks based on information from the Ingenuity Knowledge Base, and networks were generated based on connectivity.
(See Additional file 4 for all identified networks and their respective molecules.) The top four scoring networks for genes differentially expressed with tendon age were from cellular function and maintenance, cellular growth and proliferation, cellular cycling, and cellular development ( Figure 3). Significant IPA canonical pathways are demonstrated in Table 8, and the associated molecules of the top canonical pathways identified are in Additional file 5. These include hepatic fibrosis, oestrogen biosynthesis, and transcriptional regulatory networks in embryonic stem cells. Interestingly, skeletal and muscular disorders were identified as one of the top diseases associated with the gene set (Additional file 6).

Functional annotation of up- and down-regulated isoforms
There was a reduction in the DEG isoforms of 32 genes (representing 15% of the data set) relating to the ECM, degradative proteases, cytokines, and growth factors in tendon derived from older donors compared with young donors. In comparison, there was an increase in only two ECM genes (representing 1.3% of the data set) in older donors (data not shown). DAVID identified significant gene ontology (GO) terms in the upregulated and downregulated set of transcript isoforms (Table 9) with only two terms 'secreted' and 'signal' overlapping between the two groups. Interestingly, other terms are strikingly different between the upregulated and downregulated isoform data sets. Several of the top GO terms identified in downregulated isoforms in old tendon relate to collagen and post-translational modification of collagen (for example, hydroxylation, hydroxylysine, hydroxyproline, and triple helix).

Confirmation of DEG by using qRT-PCR measurements of selected genes
To validate the RNA-Seq technology, selected gene expression differences noted in the RNA-Seq analysis were re-measured by using reverse transcription and qRT-PCR. This was performed on the original RNA from all donors used to perform the RNA-Seq experiment (Table 10) and an independent cohort (Additional file 7A). All genes were found to have comparable results with RNA-Seq data; for instance, genes identified as having an increase in expression in older samples in the RNA-Seq experiment also gave increased expression relative to RPS16 following qRT-PCR. Statistical significance was tested by using Student's t test. Two genes whose expressions were not significantly altered in RNA-Seq results-aggrecan (ACAN) and MMP3-were also unaltered when assessed with qRT-PCR. Gene expression analysis using qRT-PCR of an independent cohort found similar results. Validation of differential isoform expression by using qRT-PCR was in general concordance with RNA-Seq (Additional file 7B). In all cases, the level of expression varied between the two platforms.

Discussion
Ageing is recognised as a significant risk factor for tendon injury; however, knowledge of changes to the transcriptome of tendon cells has previously been limited to that gained from quantitative PCR [5,48,49] and microarray studies on tendinopathic human [50,51] and rat tissue [17,52]. In this study, we report for the first time the use of the RNA-Seq technique to undertake deep transcriptome profiling of young and old macroscopically normal human Achilles tendon. Importantly, validation studies using qRT-PCR demonstrated high correlation between methodologies and demonstrated reproducibility using a different donor set. One of the many advantages of RNA-Seq over microarrays is that it enables de novo analysis of transcripts, including novel transcripts. In this study, we were able to identify and quantify protein-coding transcripts, alternatively spliced isoforms, lncRNAs, pseudogenes, and small regulatory RNAs, including small nucleolar RNAs (snoRNA) and an miRNA. The age of the donor accounted for most of the variability in the data, although PCA identified more variability between young donors. We did not have access to detailed medical history and lifestyle factors for the patients in this study, so we are unable to determine whether other factors explain the variability more precisely.
Tendon is characterised by a large amount of ECM interspersed around a relatively sparse population of cells. The main component of the matrix is the fibril-forming type I collagen, which composes about 70% of the dry weight of the matrix [53]. Minor collagen types include other fibril-forming collagens, type III and V; fibril-associated collagens, type XII and XIV; and type VI collagen. As expected, these were the main collagen genes we identified in the transcriptome of the Achilles tendon tissue, albeit at relatively low levels. The non-collagenous component of tendon is rich in small leucine-rich proteoglycans (SLRPs), including decorin, biglycan, fibromodulin, and lumican, and the glycoproteins COMP, lubricin, tenomodulin, and tenascin C [54]. Interestingly, the results of this study show that decorin was by far the most highly expressed ECM gene across the samples in comparison with relatively low levels of collagen transcripts. Lumican was the next most highly expressed ECM protein followed by fibromodulin and COMP. These results are in line with our recent proteomics study in which decorin was the second most abundant ECM protein in a guanidine soluble extract of equine flexor tendon [55]. Degradation of the ECM is accomplished by a family of MMPs along with other proteases, and we identified expression of collagenases, stromelysins, gelatinases, and aggrecanases, although in general the levels of expression were low. An exception to this was MMP3, a stromelysin responsible for proteoglycan degradation, which was one of the most abundant transcripts, again supporting the finding of a higher turnover of non-collagenous proteins.

Table 3 (final row): IGFBP6, insulin-like growth factor-binding protein 6, 0.7, 0.6, 859.0
Table 3 (legend): The table demonstrates the 25 most highly expressed genes in young and old tendon in terms of transcript expression as determined by using fragments per kilobase of exon per million fragments mapped (FPKM). FDR, false discovery rate.
Ageing results in changes to the tendon ECM composition, although these are poorly defined at present and the impact on tendon mechanical properties is not clear as some studies report increased stiffness with ageing [56,57] whereas others report a decrease [58,59]. A recent study using equine flexor tendon found that, although the mechanical properties of the gross structure and the component fascicles did not change with age, the inter-fascicular matrix became stiffer. Given this finding, we expected to find differential expression of ECM transcripts, particularly those enriched in the inter-fascicular matrix. The differential gene expression analysis showed no regulation of proteins likely to be enriched in the inter-fascicular matrix or inter-fibrillar proteins [54]. The alpha 1 chain of type I collagen and alpha 1 chain of type III collagen were identified as having reduced expression in the old age group, although this lost statistical significance when measured by qRT-PCR on a larger sample set. For the most part, these data do not support our original hypothesis that tendon ageing results in reduced expression of genes relating to ECM, degradative proteases, cytokines, and growth factors, unlike changes evident in ageing cartilage [24].
Tendon disease, which has a clear association with ageing, has been the focus of several gene expression studies. Generally, findings in these studies are in keeping with the hypothesis of increased matrix turnover, with an imbalance favoring catabolism. For example, various studies have demonstrated increased expression of collagen 1 alpha 1 (COL1A1) [5,48,49] and proteins more typical of cartilage COL2A1, aggrecan, and SOX9 [5,52]. Tendinopathic samples show an upregulation of various MMPs, including MMP23 [5,51], a disintegrin and metalloproteinase 12 (ADAM12) [5,50,51], and downregulation of MMP3. The results of our study are in stark contrast to this with very low expression levels for COL2A1, aggrecan, SOX9, most MMPs (except MMP3), and a significant downregulation of ADAM12 in the old group. Therefore, the results suggest that degeneration is not an inevitable consequence of ageing and that ageing and disease-associated degeneration are distinct processes.

Table 4 Top 10 genes with the highest and lowest log2 fold change when comparing young and old tendon. Log2 fold change and q value (adjusted P value) were determined in edgeR. A linear fold change of 9 corresponds to a log2 fold change of approximately 3.2. Shown are the 10 genes with highest and lowest expression in old compared with young tendon samples.
In this study, we identified DEG gene sets with ageing related to a dysregulation of cellular function and maintenance, cellular growth and proliferation, cellular cycling, and cellular development. Therefore, these changes suggest that the cellular component of tendon may lose the ability to respond appropriately to mechanical and chemical signals. Other studies have linked cellular senescence, a state of irreversible growth arrest, in a small subset of cells (progenitor cells) in tendon with tendon ageing [60]. A senescence phenotype has been described, although no marker of senescence identified thus far is entirely specific to the senescent state [9]. Most senescent cells express p16(INK)4a, which is not commonly expressed by quiescent or terminally differentiated cells [9]. In this study, p16(INK)4a was expressed at higher levels in the old group, although transcript levels overall were low, which may indicate that a small subpopulation of cells is responsible for the difference. Senescent cells have been shown to contribute to an inflammatory profile, and the term 'inflamm-aging' has been coined [13]. Studies have shown upregulation of inflammatory mediators such as cyclooxygenase 2 (COX2), interleukin 6 (IL-6), and prostaglandin E2 (PGE2) and downregulation of the lipoxin A4 (LXA4) receptor FPR2 (formyl peptide receptor 2)/ALX in human or equine tendinopathic tissue [14,49]. Inflammatory pathways, however, were not recognised in our GO mapping of DEGs between young and old groups in this study.

DNA replication and cell proliferation
Terms are derived from Ensembl [46] and Vega [47]. 'Antisense' overlaps the genomic span of a protein-coding locus on the opposite strand. 'Known' indicates identical to known cDNA or proteins from the same species and has an entry in a model database. 'Novel' indicates identical or homologous to cDNAs from the same species or proteins from all species. 'Processed transcript' does not contain an open reading frame and cannot be placed in any other category. 'Pseudogene' indicates homology to protein but from a disrupted coding sequence, and an active homologous gene can be found at another locus. 'Sense intronic' has a long non-coding transcript in introns of a coding gene that does not overlap any exons. LncRNA, long non-coding RNA (which can be further classified as LINCRNA, which is a long intergenic non-coding RNA locus of more than 200 base pairs); miRNA, microRNA; snoRNA, small nucleolar RNA.

An interesting finding in this study was the differential expression of isoforms, with those showing reduced expression in the older tendons mapping to ECM, degradative proteases, cytokines, and growth factors. Alternative splicing (AS) is a significant regulatory mechanism in gene expression as it enables versatility at the post-transcriptional level, accounting for proteome complexity, and may affect up to 92% of human genes [61].
Differences between isoforms of the same protein extend from a complete loss of function, acquiring a new function to subtle modulations, the latter observed in the majority of cases [62]. Few AS events have been reported in tendon to date. Those that have include versican, in which AS may contribute to changes in ECM structure and function in tendinopathies [48]; lubricin, which is location-dependent [63]; and insulin-like growth factor 1, which is mechanical stress-dependent [64]. The isoforms showing the greatest difference between young and old tendon groups in our study (for example, COL1A1, COL3A1, and ADAM12) are recognised as some of the most important proteins for tendon function and the relevance of these isoforms requires further investigation.
The results of our study have yielded new information relating to tendon cell phenotype and to the ageing process, identifying transcripts that are not generally recognised as being important in tendon. For example, the gene most highly expressed, disregarding ribosomal proteins, was angiopoietin-like 7 (ANGPTL7). This protein has previously been identified as highly expressed in microarray analysis of human tendinopathic tissue from various tendons [50]. Angiopoietins are involved in angiogenesis [65], inflammation, and glucose [66] and lipid [67] metabolism. In the cornea, ANGPTL7 may function as a negative regulator of angiogenesis, contributing to the avascular properties of the tissue [68], whereas in human ocular trabecular meshwork cells, it has a role in the organisation of the ECM [69]. Thus, we suggest that ANGPTL7 may have a role in maintaining the relatively avascular nature of tendon tissue and in the organisation of the ECM. This represents an area for further investigation.
One of the limitations of this study is that the samples were taken from patients with malignant disease. We consider it very unlikely that this has influenced the results, as samples were taken only when the tumour was at a site distant from the tendon and the tendon was macroscopically normal; however, we cannot rule out the possibility that some of the genes showing high expression, such as metastasis-associated lung adenocarcinoma transcript 1 (MALAT1), ANGPTL7, and S100A6, are related to the disease state.
Another point of interest was the expression of genes associated with muscle cells. For example, we observed reduced expression of myosin heavy chain 1 and increased expression of myogenic factor 5 (MYF5) and MYOCD. Our previous studies in ageing cartilage also identified differential expression of muscle-related genes: myosin heavy chain 2, myosin 3A, and myosin 1B, which were all reduced in cartilage ageing [24]. Samples of Achilles tendon were taken at a region far removed from muscle insertion, and the identification of muscle genes in cartilage and tendon is unlikely to be due to inadvertent inclusion of muscle tissue.
The significance of the association between the data set and the canonical pathway was measured by using a ratio of the number of molecules from the data set that mapped to the pathway divided by the total number of molecules that map to the canonical pathway. Fisher's exact test was used to calculate P values.
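The pathway ratio and Fisher's exact test described in the legend above can be illustrated with a minimal sketch. This is not the pathway analysis software's own implementation: the function names, the example counts, and the choice of a right-tailed (over-representation) test against a fixed background set are all assumptions made for illustration only.

```python
from math import comb


def pathway_ratio(overlap, pathway_size):
    """Molecules from the data set that map to the pathway, divided by
    the total number of molecules in the canonical pathway."""
    return overlap / pathway_size


def fisher_right_tailed(overlap, dataset_size, pathway_size, background_size):
    """Right-tailed Fisher's exact test (assumed here): probability of
    seeing at least `overlap` pathway members in a data set of
    `dataset_size` molecules drawn from `background_size` molecules,
    of which `pathway_size` belong to the pathway."""
    max_overlap = min(dataset_size, pathway_size)
    total = comb(background_size, dataset_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(background_size - pathway_size, dataset_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts, not taken from the study.
ratio = pathway_ratio(overlap=4, pathway_size=8)
p_value = fisher_right_tailed(
    overlap=5, dataset_size=5, pathway_size=5, background_size=10
)
print(ratio, p_value)
```

A small P value indicates that the overlap between the data set and the pathway is larger than expected by chance, whereas the ratio only reports how much of the pathway is covered; the two are therefore usually read together.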
Interestingly, there was an increase in the expression of a large set of transcription factors in old compared with young tendon. In Caenorhabditis elegans [70] and a number of tissues, including heart [71] and brain [72], transcription factors have been implicated in ageing. Notably, Nk2 homeobox 1 (NKX2-1), a transcription factor showing upregulation in old tendon in our study, is involved in neuronal activation in dorsomedial and lateral hypothalamic nuclei, a function thought to contribute to a more 'youthful' physiology during ageing [73]. Conversely, an isoform of scleraxis (SCX), a critical transcription factor in tendon development [74], was reduced in old tendon. In addition, the reduced expression in old tendon of EGR2 and AS EGR1, both of which are required for tendon differentiation [75], may affect tendon repair [76].
We identified eight pseudogenes showing upregulation in the old group of tendons. Pseudogenes have sequences similar to those of their counterpart coding genes, but owing to mutation, deletion, or insertion of nucleotides, they generally cannot be translated into a functional protein. It is hypothesised that pseudogenes act as post-transcriptional regulators of the corresponding parental gene [74]. In other studies, pseudogenes have been identified as increasing with age, such as pseudogene cyclin D2 in the ovary [76], and recent work has indicated that they may have a role in inflammation [77]. This provides an exciting new frontier to explore in ageing research, and further work is required to determine whether any of the pseudogenes identified in this study have functional significance.
Our study is the first to profile lncRNAs in tendon. LncRNAs are a large and functionally heterogeneous class of RNAs with a length of more than 200 nucleotides. They have been shown to regulate mRNA transcription, splicing, stability, translation, and epigenetic modification, providing a complex spectrum of gene regulatory functions [78], and a number of studies have identified roles for lncRNAs in ageing [22,79]. In this study, lncRNAs were differentially expressed in ageing tendon, and 34 showed upregulation in old tendon. In musculoskeletal disease, relatively little work has interrogated the role of lncRNAs in tissue physiology and disease except for a few studies in cartilage/OA [80][81][82] and muscle (reviewed in [83]) and an osteosarcoma study [84]. The lncRNA transcriptome signatures in ageing tendon provide an interesting set of genes for further studies to determine their role in tendon ageing and disease.

Conclusions
Our study is the first to interrogate tendon by using RNA-Seq. We demonstrate dynamic alterations in RNA with age, at numerous genomic levels, which indicate changes in the regulation of transcriptional networks. Further extensive follow-up analysis of modulator genes, splice variants, and non-coding RNAs found in this study may be useful in understanding tendon ageing.
Values for real-time polymerase chain reaction (RT-PCR) are the mean ± standard deviation of relative expression levels normalised to expression of RPS16 (to two decimal places). Statistical significance was tested by using Student's t test. RT-PCR results are expressed as 2^-ΔCT. ACAN, aggrecan; COL1A1, collagen type 1 alpha 1; COL3A1, collagen type 3 alpha 1; EGF, epidermal growth factor; IGF1, insulin-like growth factor 1; LINC00957, long intergenic non-protein coding RNA 957; MMP3, matrix metalloproteinase 3; MMP16, matrix metalloproteinase 16; MYF5, myogenic factor 5; MYH1, myosin heavy chain 1; POU3F4, POU class 3 homeobox 4; RP11.308 N19.1, lnc-ZNF462-2; TGFB3, transforming growth factor β 3.
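The 2^-ΔCT expression of RT-PCR results mentioned in the legend above can be sketched as follows. This is a minimal illustration, assuming that ΔCT is the target-gene cycle threshold minus the reference-gene cycle threshold for the same sample; the function names and the CT values are hypothetical and are not data from the study, and the significance test is not covered here.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference):
    """Return 2^-dCT, where dCT = CT(target) - CT(reference gene)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def summarise(ct_pairs):
    """Mean and sample standard deviation of 2^-dCT across samples,
    rounded to two decimal places as in the legend."""
    values = [relative_expression(target, ref) for target, ref in ct_pairs]
    return round(mean(values), 2), round(stdev(values), 2)


# Hypothetical (target CT, reference CT) pairs for one group of samples.
example_group = [(22.0, 20.0), (21.0, 20.0), (20.0, 20.0)]
group_mean, group_sd = summarise(example_group)
print(group_mean, group_sd)
```

Because each unit of ΔCT corresponds to a two-fold change in template abundance, a target that crosses threshold five cycles after the reference gene has a relative expression of 2^-5, or about 0.03.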