2014-11-29-Please_Allow_Others_to_Be_Better_Than_You-Reflections

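I just read an article from Luoji Siwei (罗辑思维):
http://reproduced.farbox.com/post/2014-12-29-qing-yun-xu-bie-ren-bi-ni-you-xiu
Its main point is: do not crush other people's dreams, and do not blindly worship the successful either.

The article makes good sense, but as I see it, comparing other people with yourself is meaningless. Everyone differs in starting point, talent, effort, and many other factors, controllable or not, and the very definition of "excellent" is fuzzy. Such a comparison is about as absurd as gathering all the animals together to compete in a triathlon. A more meaningful comparison is between your present self and your past self: have you made progress, and are you happier, wealthier, healthier?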

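2014-11-10-Dark_Time_Summary

I recently reorganized and reread Liu Weipeng's e-book "Dark Time" (《暗时间》). Here is the summary I put together, for my own future reading and reflection. If you find the content interesting, I recommend reading the original articles or buying a printed copy.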


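Preface: Why Everyone Should Learn Some Psychology

Only by avoiding fallacies of thought can we think correctly.

Part One: Dark Time

Dark Time

People who make good use of thinking time quietly gain a great deal of extra time, and in a practical sense live many more years than others. Those who fully exploit dark time gain a large extra slice of life. You may find that such people seem to play no less than you and read no more than you, yet somehow they get further. Entering a focused state quickly and sustaining it for a long time are the two most important habits for efficient learning. Only with strong resistance to distraction can you make effective use of the kinds of dark time mentioned above.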

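Design Your Own Progress Bar

Good planners split a goal into milestones, and each milestone into a TODO list. Quitting too early usually stems from uncertainty about the future and the fear that the time invested will never pay off. The greater the perceived difficulty, the greater the fear, because a harder task tends to imply a larger investment of time. Quitting too early is the root of all failure. Interest is everywhere; focus and persistence are what is truly scarce. Life offers far more choices than we imagine, and small differences in choice add up to different lives. Success built on professional skill is the most reproducible kind. Reflection is the most important thinking habit for self-improvement. Of the knowledge accumulated over a lifetime, at least 90% is self-taught.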

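How to Memorize and Learn Effectively

The knowledge you possess is not determined by how much you remember but by whether it can be recalled at the right moment. For someone who memorizes through understanding, knowledge comes with fine-grained concepts, logic, general problem-solving principles, common techniques, background knowledge, similar problems, and countless other memory and retrieval cues, rather than being an isolated, arbitrary sequence of text. A memory without cues is like an island in the ocean of memory: it is there, but hard to reach. A memory rich in cues is Rome, and all roads lead to Rome. When distilling knowledge from experience, we should abstract appropriately to obtain knowledge with a wider scope (rather than one fix per problem). Conversely, when facing a new problem, we should also abstract it, reach its essence, and strip away irrelevant factors so that the previously abstracted knowledge can be retrieved effectively. Some concrete practices: 1. regularly review what you have learned over a period; 2. create opportunities to recall (discuss with others often, or explain things to them; tidy your notes regularly; stringing together what you learned over a period around one theme greatly enriches the links between pieces of knowledge and adds countless retrieval cues); 3. put yourself in others' shoes and "virtually experience" what they went through; 4. abstract and generalize, extending rules found in special cases to the general case; 5. relate and compare with your own experience.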

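Learning Density and Focus

Why does focus affect learning efficiency so much? There are two reasons. First, focusing on one thing lets the conscious mind run at full power; this is the explicit gain. Second, and more importantly, it puts your subconscious into a state of focusing on that thing as well. A person used to focusing slips into a focused state easily and quickly, whatever the task. Mental stamina is the length of time you can keep your attention concentrated. Attention makes extraordinary experts, and genius comes from long, focused training. Building mental stamina is a necessary condition for becoming an extraordinary expert. Besides cultivating the habit of focus, there is another, sufficient condition for achieving focus: doing what you love.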

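Some Learning Habits That Have Stayed with Me

Learning and Thinking

  1. Google & Wiki;
  2. Be picky about books; read only the classics;
  3. Take reading notes;
  4. Use walking and meal times for thinking;
  5. Read more books on psychology and thinking;
  6. When learning a piece of knowledge, always ask yourself three important questions: a. What is its essence? b. What is its first principle? c. What is its knowledge structure?
  7. Questions to ask yourself often while learning and thinking: a. What exactly is your question? (A reminder not to drift away from the problem.) b. OK, what have I actually gained so far? (A reminder to summarize and organize what you learn from time to time.) c. Imagine you are explaining the material to someone else (thinking aloud; whether you can explain it is the best test of whether you truly understand it). d. Imagine you must explain it to someone who knows nothing about it (this forces you to dig out the most essential, and often simplest, explanation behind the knowledge). e. Regularly reflect on and observe your own thinking process. Especially after hitting a problem you cannot understand or solve, retrace your earlier thinking to see which step was blocked and where the problem really lies, and work out which thinking habits you need to strengthen so that you do not stumble at the same or a similar point again. Writing down the rough thread of your thinking is a good habit for this. f. Get into the habit of refuting your own ideas: whenever you have an idea, habitually challenge it with questions such as "Must this be true?", "Are there counterexamples or exceptions?", "Is that really so?". g. The human mind is naturally prone to understanding things superficially. Think you have understood a problem? Reflexively ask yourself: Do you really understand it? Do you really understand its essence? What exactly is its essence? What is my current understanding? Am I satisfied with it? What is constructive about this understanding? And so on.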

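Time and Efficiency

  1. While your enthusiasm for something lasts, push straight through the hardest stage, the beginning;
  2. Important things first;
  3. Set aside large blocks of time for important things;
  4. Make good use of small scraps of time;
  5. Pay attention to the essence of knowledge;
  6. Respect the power of accumulation; prepare everything in advance;
  7. Look up from time to time and examine what you are doing: ask what value it has (now or in the future) and whether it is what you really want to do;
  8. Unsubscribe from RSS feeds;
  9. Got time? Summarize the new knowledge you have gained recently;
  10. Got time? Read a book;
  11. Draw up a brief reading plan;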

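Reading Methods

  1. Look up material according to a topic, rather than looking up topics according to the material;
  2. Good material, bad material;
  3. Before learning something, first build up a good deal of "puzzlement" in your mind;
  4. Read selectively. In general, divide the content like this while reading: a. What is the problem? b. What is the solution? c. What is the example?
  5. Categories of reading: I generally divide books into two kinds, those about knowledge and those about thinking. I tend to prefer books that train thinking, because thinking skills cut across disciplines and are useful at any time. Conversely, if your thinking is poorly trained, learning easily goes in the wrong direction or takes twice the effort for half the result.
  6. Any scrap of time can be used for reading;
  7. Why can't I understand it? If you cannot understand a piece of knowledge, the likely causes are: a. You are not reading hard enough. b. It involves a concept you do not know. c. The author's order of presentation is off; keep reading, and the later parts may make the earlier ones clear.
  8. How to get a rough estimate of a book's quality before reading it: a. Look at the author. b. Look at the table of contents and the introduction. c. Look at the reviews on Amazon. d. Look at the sample chapter.
  9. How to find good books: a. Other works by the same author. b. Amazon's related recommendations and topic book lists. c. Other works highlighted in the references of a good book. d. Sometimes you can find a guide to reference resources on a topic compiled by some kind soul, which is the best case of all.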

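Knowledge Structure

Grasp the invariants. I like to divide knowledge into essential and non-essential. I master the former thoroughly in advance, and handle the latter by RTM (Read the Manual) when the need arises. Why master essential knowledge in advance? a. In short, this low-level knowledge will inevitably be needed, so it is better to learn it beforehand. Looking it up at the moment of need is too late, because essential knowledge is usually exactly what takes a long time to digest, unlike Ruby's mixins or closures, which you can pick up by flipping through the manual. b. If you do not know a tool exists, you are unlikely to think of using it when a problem arises. Essential knowledge is the most widely used tool of all; when you hit certain programming problems without the underlying knowledge, you may not even know which knowledge you need to acquire to solve them. c. You must know your tools before you can use them effectively. Other examples of what I consider essential knowledge: methods of analyzing and solving problems (hard to master from one or two books; they take long practice and reflection) and methods of judgment and decision-making (life calls for judgment and decisions far more often than we imagine). As Popper said: All Life is Problem-Solving. When studying a small field, keep reminding yourself that the goal is "to be able to write a beautiful survey in the end". This helps you organize the structure, essence, and key points of the knowledge, consciously or not, as you read and practice. Knowledge that has been organized is understood more deeply, forgotten less easily, and retrieved more readily.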

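Forming Habits

The first rule is to recognize that changing a habit is never a matter of a day or two, and to admit how hard it is. The second is that if you really want to drop a habit, you must keep observing your own behavior along the way; otherwise the habit will steer your behavior in ways you cannot detect and undo all your effort. One cognitive trick may ease the discomfort of changing a habit: treat the irrational self living inside you as your own child (whom you are raising) or as your opponent (whom you are trying to beat). In any case, do not treat it as yourself, because nobody wants to change themselves.

My Seven Years at Nanjing University

To judge a person, just look at the books he reads and the people he meets.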

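Part Two: Thinking Changes Life

Escape Your Shawshank

Why we often say that some things can only be understood after experiencing them yourself

  1. First-hand experience; 2. Stories from other people's mouths; 3. Why; 4. The world is complex; 5. The future is uncertain; 6. Other people's principles, your own affairs; 7. Cognitive dissonance and self-justification; 8. Failure is success; 9. Emotional contrast; 10. Nature; 11. Habit;

Does experiencing something yourself guarantee understanding?

  1. Naive, simple-minded conditioned reflexes; 2. Cognitive biases; 3. The emotional system;

Understanding without experiencing: the power of reason

Ordinary people learn from their own mistakes; smart people learn from the mistakes of others.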

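To Each His Own View? From Visual Illusions to Prejudice

In social and cultural matters, people often use the saying "the benevolent see benevolence, the wise see wisdom" (仁者见仁、智者见智) to refer to three phenomena:
1) Prejudice: different people wear different tinted glasses and so understand or explain the same phenomenon differently. Is it the ordinary explanation or the conspiracy-theory one? It depends on the mind of the beholder.
2) Standpoint: for example, there is no unified set of axioms for "the meaning of life", so every way of life is reasonable, and each person may hold different values and optimize a different objective function.
3) Selective attention: faced with the same thing, different people attend to different points. The elephant has four legs, and each person feels a different one.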

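Meeting Yourself of 200,000 Years Ago

Because the human brain was "piled up" over long evolutionary ages, from reptiles to mammals to higher primates, it went step by step from having only primitive reflex modules, to having rudimentary emotional regions, to the high-level cognitive abilities supported by the remarkable six-layered neocortex. The most painful struggle in the world is not with others but with yourself. Our judgments on many matters are etched into our nature, yet that same nature often lands us in trouble. Exercising rational thought regularly also builds up the "strength" of the rational brain, giving it an overwhelming advantage in more decisions.

Reason and Emotion

Once our emotional brain has made up its mind about something, our pitiful rational thinking readily yields to its command: explain things in the way that favors us. As long as an explanation favorable to us exists, our brain will, without hesitation and in self-deception, treat it as the only explanation. Everything has two explanations: an ordinary one and a crazy one. From the angle of self-justification, everything also has two explanations: one favorable to us and one unfavorable. By choosing the former, we can fool ourselves into letting ourselves off the hook.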

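Writing Is for Thinking Better

Writing has the following benefits:

  • Writing is a memo for thought
  • Writing is a cache for thought
  • Writing is a dialogue with yourself
  • Writing is communication with others
  • Sometimes, language thinks for itself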

Why You Should Start Blogging Right Now

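In one sentence: writing a blog has many benefits and no obvious drawbacks.
The biggest benefits of writing a valuable blog over the long term:

  1. You make many like-minded friends.
  2. Writing is for thinking better.
  3. "Teaching" is the best way of "learning". Once you pull what sits in your subconscious out from behind the scenes, you have a chance to face it and reflect on it, instead of letting it sway your thinking covertly.
  4. Discussion is an excellent form of reflection.
  5. It motivates you to keep learning and thinking.
  6. You learn to stick with one thing.
  7. A long-running, valuable blog is a very good résumé.

How to Keep Writing a Valuable Blog over the Long Term

Become a person who keeps learning and thinking, and write only what comes out of real thinking and summarizing; everything else will follow.

Problems That May Arise and How to Deal with Them

1) Worrying that others will find it worthless. The fact is that any problem you have faced, someone else has faced too; if you thought about it independently and they did not, your article will be valuable to them. 2) Worrying that your ideas are naive or flawed and that people will laugh. Nobody is a sage. It is precisely because one person's ideas always have gaps that they are worth sharing; having a problem pointed out shows you where to improve, and ideas kept hidden never mature. 3) Lack of encouragement. This is really the silliest problem; only blogs of idle chatter face the "encouragement" problem. If you write your own summaries and your own independent thinking, then writing things down and understanding them thoroughly is itself a great reward. 4) Having nothing to write. This one is rather silly too; thinking is not something to be rushed. Once you are used to thinking about problems, you will always have something to write: first thinking, then summarizing, then further thinking while summarizing.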

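"I Don't Want To" and "I Can't"

It usually starts like this: you find that you want to do something, but you quickly discover that you are not good at it or cannot do it. So the thought "I want to do this" is defeated and shelved for now. What else can you do? You are not good at it anyway. Some time later I ask whether you want to do it. You say yes, then add: but I can't. After wavering painfully between "want to" and "can't" for a while, I ask again whether you want to do it, and your answer has become: no. What changed inside you? First, "want to" and "can't" are conflicting thoughts that cannot easily coexist; kept together in your head, they torment you endlessly. After being tormented long enough, your mind makes a choice: change "want to" or change "can't". Changing "want to" is easy: just turn it into "don't want to". Changing "can't" is much harder: you have to actually do the thing. So you decide to give up "want to" and switch to "don't want to". In the end you still have not done it, but, strangely, you feel that the reason is "don't want to" rather than "can't". Now your reason is sound and you feel at ease, because not doing something because you do not want to is a perfectly natural reason. You will not admit that you did not do it because you could not. Unfortunately, you have fooled yourself. Why? Because the very reason you "don't want to" is that you found you "can't". Your "don't want to" is not a genuine lack of desire or interest but a compromise with "can't". Psychology calls a process like this "self-serving attribution".

In short, self-serving attribution means attributing the cause of an event to whichever circumstance favors you. In plain words: never embarrass yourself, never leave yourself without a way out. Take all the credit yourself and leave all the blame to others.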

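Why You Should Solve Problems Yourself

Why looking for shortcuts is a smart thing to do

When we learn something new and run into difficulty, why do we give up? Because we subconsciously weigh the difficulty we face against the payoff of success. Once the perceived difficulty reaches a certain level, the brain reasons: since failure is very likely, and success is not even in sight, why waste the effort? Better to save it. This is a smart economic decision; weighing cost against benefit should be every economic agent's principle. The flaw in this decision, however, is that it overestimates the difficulty, so its premise is wrong. Most knowledge only "suddenly becomes clear" once you have mastered it; before that, you feel it is too hard, with no thread to follow and no way in.

Why looking for shortcuts is only petty cleverness

To solve a technical problem, you scour the Internet and go through a number of tutorials, websites, and books. In the end you not only solve the problem but also learn where to find references fastest the next time a similar problem comes up, which websites hold the most valuable information in the field, and which books are its classics. You may even stumble on a few unexpected gains along the way.

In life and at work, the problems you meet are, to a large extent, not isolated: having met one, you will very likely meet similar ones later. Each time you simply ask for the answer, you commit yourself to relying on other people's brains forever. The hard road gets easier the further you go; the easy road gets harder.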

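What Is Your Irreplaceability and Core Competitiveness

A person's core competitiveness is their unique combination of personality, knowledge, and experience. It does not lie in which technology you learned, how deeply, or how high your IQ is, but in the combination of personality, background, knowledge, and experience that you have and others do not. If this combination is 1. one of a kind, 2. valuable in practice, and 3. capable of sustained development, then you have core competitiveness.
To cut a long story short, I believe the following combination of knowledge and skills is hard to replace:

  • Domain expertise: become an expert in a professional field. The stronger your professional skills, the harder you are to replace in that field. This goes without saying.
  • Cross-domain skills: problem solving, creative thinking, judgment and decision-making, critical thinking, communication, an open mind, and so on.
  • Learning ability: strictly speaking, this is also a cross-domain skill, but it is so important, and spans every field, that it deserves its own entry. The most effective way I know of to cultivate it is to keep learning and thinking about new knowledge.
  • Character traits: strictly speaking, these are cross-domain skills too, for the same reason. Traits I believe matter include focus, persistence, self-reflection (the ability to recognize your own problems, the precondition for self-improvement), curiosity, confidence, and humility (confidence and humility do not conflict: the former is believing that what others can do, you can do too; the latter is not assuming that what you are sure of must be right; keep an open mind), and so on.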

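Part Three: Learning Problem Solving from Pólya

Learning Problem Solving from Pólya

Some stories

In fact, if you pay close attention to the problem-solving process, you may find that all heuristics serve association, and that association serves to retrieve valuable properties or theorems from the knowledge stored in our brains, supplying the missing links between condition and conclusion, between the known and the unknown.

A bit of history

First, we treat the problem to be solved as a condition, derive a conclusion from it, and derive further conclusions from that one, until at some point we find that a genuinely known condition has appeared. This process is called analysis. With this path in hand, we can start from the known conditions and work all the way to the solution.

Some methods

  1. Never lose sight of the unknown;
  2. Use special cases to prompt ideas;
  3. Work backwards;
  4. Trial and error;
  5. Adjust the conditions of the problem (e.g. delete, add, or change a condition);
  6. Solve a similar problem;
  7. List every theorem or property that may be related to the problem;
  8. Examine the opposite; examine all other cases;
  9. Generalize the problem and solve the generalized version;
  10. Subconscious incubation;
  11. The hot-potato method;

Some thoughts

  1. The laws of association
  2. Knowledge
  3. Good problems, bad problems
  4. A good habit
  5. Practice, practice
  6. The limits of heuristics
  7. The value of summarizing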

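The Hammer and the Nail

If you have a hammer in your hand, everything looks like a nail.
The right attitude is:
a hammer in the hand, no hammer in the mind.

The Fish Is the Last to See the Water

If you want to drive a nail, everything looks like a hammer. Everything looks like a nail because people tend to solve problems within an existing framework; more importantly, in doing so it is hard to notice that the framework's constraints exist, just as a fish does not notice the water. The underlying reason is that people are highly adaptable.
Ordinary people follow the rules, experts ignore the rules, great people create the rules.

Design patterns

People who make simple things complicated are everywhere; people who make complicated things simple are rare. Do not feel that code without design patterns is not good enough or not powerful enough; completing the task in the simplest possible way is what counts. No code beats code, and the same goes for design patterns.

Language wars

One cause of language wars is that people tend to think within the framework of the language they know, forming strong biases: they see only the advantages of their own language, and even perceive non-advantages as advantages.

Using a language

The more familiar a programmer is with a language, the more easily that language becomes a burden. The best way to keep your thinking from being bound by one language is to learn other languages.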

C++

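  1. What is the first principle of learning C++? Focus on fundamental concepts and techniques rather than specific language features, and least of all on C++'s trivial language details.
  2. What is the first principle of using C++? Map your design ideas (concepts) (pongba's note: language-independent ones) directly onto C++ classes or templates. Conclusion: Think out of the box.

Knowing the Why

What we want is not the theory of relativity but the brain that gave birth to it. What we want is not the golden egg but the hen that lays it.

Explaining the thinking process rather than the result has several very important benefits:

  1. Internalization: rules of thinking are also knowledge (meta-knowledge, the knowledge that helps us acquire new knowledge); they are implicit memory.
  2. Use across contexts: rules of thinking are also remembered knowledge; they are problem-solving strategies.
  3. More retrieval cues for recalling the solution.
  4. Far more knowledge is included: memorize an algorithm, and you have just one algorithm.
  5. The emphasis is on analysis and reasoning, not association.
  6. Look for the original source of the algorithm.
  7. Even the original source may not explain things to you candidly and completely.
  8. Besides learning other people's lines of thought, organizing your own is also extremely important.

Why it is necessary to know the why

Until you understand the proof behind it, any theorem is equivalent to you: equivalent to reciting the multiplication table. Understanding a theorem's proof brings huge benefits, the first being that you will find it hard to forget the theorem again. Knowledge forms a tree, and the higher you climb, the fewer nodes you need to memorize.

Knowing how to do something comes from the correct (efficient) solution, whereas knowing why it must be done that way often comes from the wrong (inefficient) solutions.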

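Cantor, Gödel, Turing: An Eternal Golden Diagonal

Turing's halting problem

Y Combinator

lambda calculus

The puzzle of recursion

A successful attempt

The fixed-point principle

Forging the Y Combinator

Gödel's incompleteness theorem

From the Gödel sentence to the Y Combinator

Great truths are simple: Cantor's genius

The magic of one-to-one correspondence

The reals and the natural numbers cannot be put into one-to-one correspondence?!

The diagonal method: the deeper meaning of the halting problem

Russell's paradox

The fruits of Hilbert's tenth problem

The diagonal method: a review

The Beauty of Mathematics, Extra: Why Quicksort Is So Quick

Guessing a number

Weighing balls

Sorting

Information theory! Information theory?

The Beauty of Mathematics, Extra: The Ordinary yet Magical Bayesian Method

Foreword

History

Spelling correction

Model comparison and Occam's razor

Bayes everywhere

The naive Bayes method

Hierarchical Bayesian models

Bayesian networks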

2014-11-07-XSLT_Tips

XSLT usage and performance tips

Eight tips for how to use XSLT efficiently:

  • Keep the source documents small. If necessary split the document first.
  • Keep the XSLT processor (and Java VM) loaded in memory between runs
  • If you use the same stylesheet repeatedly, compile it first.
  • If you use the same source document repeatedly, keep it in memory.
  • If you perform the same transformation repeatedly, don’t. Store the result instead.
  • Keep the output document small. For example, if you’re generating HTML, use CSS.
  • Never validate the same source document more than once.
  • Split complex transformations into several stages.

Eight tips for how to write efficient XSLT:

  • Avoid repeated use of “//item”.
  • Don’t evaluate the same node-set more than once; save it in a variable.
  • Avoid xsl:number if you can. For example, by using position().
  • Use xsl:key, for example to solve grouping problems.
  • Avoid complex patterns in template rules. Instead, use xsl:choose within the rule.
  • Be careful when using the preceding[-sibling] or following[-sibling] axes. This often indicates an algorithm with n-squared performance.
  • Don’t sort the same node-set more than once. If necessary, save it as a result tree fragment and access it using the node-set() extension function.
  • To output the text value of a simple #PCDATA element, use xsl:value-of in preference to xsl:apply-templates.
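
As a rough illustration of the xsl:key tip, the sketch below groups items by category with a key lookup (the technique usually called Muenchian grouping). The input vocabulary (a catalog root with item children, a category attribute, and a name child) is invented for this example and does not come from the tips above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every item by its category attribute (hypothetical input). -->
  <xsl:key name="items-by-category" match="item" use="@category"/>

  <xsl:template match="/catalog">
    <!-- Select only the first item of each category: one node per group. -->
    <xsl:for-each select="item[generate-id() = generate-id(key('items-by-category', @category)[1])]">
      <xsl:value-of select="@category"/>
      <xsl:text>: </xsl:text>
      <!-- The key lookup replaces a repeated scan such as //item[@category = ...]. -->
      <xsl:for-each select="key('items-by-category', @category)">
        <xsl:value-of select="name"/>
        <xsl:if test="position() != last()">, </xsl:if>
      </xsl:for-each>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The processor can build the key index once, so each group lookup avoids rescanning the document. How much this helps in practice depends on the processor and the size of the input.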

XSLT Best Practices

XSLT (Extensible Stylesheet Language Transformations) is a functional language for transforming XML documents into another file structure such as plain text, HTML, XML, etc. XSLT is available in multiple versions, but version 1.0 is the most commonly used version. XSLT is extremely fast at transforming XML and does not require compilation to test out changes. It can be debugged with modern debuggers, and the output is very easy to test simply by using a compare tool on the output. XSLT also makes it easier to keep a clear separation between business and display logic.

Uses

XSLT has numerous uses. XML is easy to generate and can easily be transformed to the desired layout of other systems. Many older EDI systems need to receive data in a fixed, flat file format. One such example of a fixed file format is the ABA file format used in the banking industry of Australia. XSLT can be used to transform your data source to a flat file format for another system to consume, and that same data source can then be used to transform the data into HTML for display in a web browser. In fact, it’s even possible to use XSLT to build an XSLT view engine for use with MVC to render content.

Another use for XSLT is creating dynamic documents in various formats such as Word, Excel, and PDF. Starting with Office 2003, Microsoft began supporting the WordML and ExcelML data formats. These data formats are XML documents that represent a Word document or an Excel spreadsheet. Data from a database can be easily transformed into either of these formats through the use of XSLT. In addition, the same data source can also be transformed into XSL-FO to create PDF documents.

Besides the two uses above, you may want to consider using XSLT whenever you are working with templates, when you are working with XML data, or when you are working with static data that doesn’t need to live in a database. An example of a template would be an email newsletter that gets sent out and is “mail-merged” with data from the database.

Of course, there are times when you could use XSLT to accomplish a programming task, but it might not be the right choice. For instance, it might be easier to use LINQ to access data from an object hierarchy and then use a StringBuilder to build the output than to do the same thing in XSLT. XSLT might also not be appropriate for generating output if you need to do a large amount of string manipulation. String operations such as replace or split are not as easy to accomplish in XSLT as they are in languages like C#.

Basics

Assuming that XSLT is the right solution for the task you are trying to accomplish, there are several basic things that a developer needs to be aware of. The first thing to remember is that XSLT is a functional language. Once a variable is set, it cannot be changed. In order to build up a value step by step, you need to set up a template that you can call recursively. The following is an example of what that code might look like:

<xsl:template name="pad-left">
    <xsl:param name="totalWidth"/>
    <xsl:param name="paddingChar"/>
    <xsl:param name="value"/>
    <xsl:choose>
        <xsl:when test="string-length($value) &lt; $totalWidth">
            <xsl:call-template name="pad-left">
                <xsl:with-param name="totalWidth">
                    <xsl:value-of select="$totalWidth"/>
                </xsl:with-param>
                <xsl:with-param name="paddingChar">
                    <xsl:value-of select="$paddingChar"/>
                </xsl:with-param>
                <xsl:with-param name="value">
                    <xsl:value-of select="concat($paddingChar, $value)"/>
                </xsl:with-param>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="$value"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

The template above performs the equivalent of the PadLeft function in .NET. It takes three parameters and checks whether the length of the value passed in is less than the requested total width. If it is, the template calls itself again, passing the same width and the value with the padding character prepended. This repeats until the length of the value is greater than or equal to the requested width, at which point the value is output.
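
A call site for the template might look like the sketch below. The Invoice element is borrowed from the later examples, and its number attribute is an assumption made only for illustration.

```xml
<!-- Hypothetical call site: pads the invoice number to 8 characters with zeros. -->
<xsl:template match="Invoice">
  <xsl:call-template name="pad-left">
    <xsl:with-param name="totalWidth" select="8"/>
    <xsl:with-param name="paddingChar" select="'0'"/>
    <xsl:with-param name="value" select="string(@number)"/>
  </xsl:call-template>
</xsl:template>
```

With number="42" this should produce 00000042. Note that an empty paddingChar would make the recursion never terminate, so a caller is expected to pass a non-empty string.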

Another important thing to know when working with XSLT is that namespaces affect how you select data from XML. For instance, let’s say you’re working with XML that starts with the following fragment:

<FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">

In order to select data from this XML document, you need to include a reference to the namespace(s) used in the XML document that you are consuming in your XSLT. For the example above you would do something like this:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    xmlns:fm="http://www.filemaker.com/fmpxmlresult"
    exclude-result-prefixes="msxsl fm">

<xsl:template match="fm:FMPXMLRESULT">
    <xsl:apply-templates select="fm:RESULTSET" />
</xsl:template>

<!-- templates for fm:RESULTSET and its children go here -->
</xsl:stylesheet>

The last area I would like to focus on is the use of templates. XSLT provides two techniques for accessing data: push and pull. The push approach, as the name implies, pushes the source XML to the stylesheet, which has various templates to handle the various kinds of nodes. Such an approach uses several templates and applies the appropriate one for a given node through the xsl:apply-templates instruction. An example of this is as follows:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template match="Orders">
        <html>
            <body>
                <xsl:apply-templates select="Invoice"/>
            </body>
        </html>
    </xsl:template>
    <xsl:template match="Invoice">
        <xsl:apply-templates select="CustomerName" />
        <p>
            <xsl:apply-templates select="Address" />
            <xsl:apply-templates select="City" />
            <xsl:apply-templates select="State" />
            <xsl:apply-templates select="Zip" />
        </p>
        <table>
            <tr>
                <th>Description</th>
                <th>Cost</th>
            </tr>
            <xsl:apply-templates select="Item" />
        </table>
        <p />
    </xsl:template>
    <xsl:template match="CustomerName">
        <h1><xsl:value-of select="." /></h1>
    </xsl:template>
    <xsl:template match="Address">
        <xsl:value-of select="." /><br />
    </xsl:template>
    <xsl:template match="City">
        <xsl:value-of select="." />
        <xsl:text>, </xsl:text>
    </xsl:template>
    <xsl:template match="State">
        <xsl:value-of select="." />
        <xsl:text> </xsl:text>
    </xsl:template>
    <xsl:template match="Zip">
        <xsl:value-of select="." />
    </xsl:template>
    <xsl:template match="Item">
        <tr>
            <xsl:apply-templates />
        </tr>
    </xsl:template>
    <xsl:template match="Description">
        <td><xsl:value-of select="." /></td>
    </xsl:template>
    <xsl:template match="TotalCost">
        <td><xsl:value-of select="." /></td>
    </xsl:template>
    <xsl:template match="*">
        <xsl:apply-templates />
    </xsl:template>
    <xsl:template match="text()" />
</xsl:stylesheet>

The pull approach, on the other hand, makes minimal use of the xsl:apply-templates instruction and instead pulls the XML through the transform with the xsl:for-each and xsl:value-of instructions. Using the pull technique, the above stylesheet would look something like this:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template match="Orders">
        <html>
            <body>
                <xsl:for-each select="Invoice">
                    <h1>
                        <xsl:value-of select="CustomerName" />
                    </h1>
                    <p>
                        <xsl:value-of select="Address" /><br />
                        <xsl:value-of select="City" />
                        <xsl:text>, </xsl:text>
                        <xsl:value-of select="State" />
                        <xsl:text> </xsl:text>
                        <xsl:value-of select="Zip" />
                    </p>
                    <table>
                        <tr>
                            <th>Description</th>
                            <th>Cost</th>
                        </tr>
                        <xsl:for-each select="Item">
                            <tr>
                                <td><xsl:value-of select="Description" /></td>
                                <td><xsl:value-of select="TotalCost" /></td>
                            </tr>
                        </xsl:for-each>
                    </table>
                    <p />
                </xsl:for-each>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>

You can read more about these two approaches at http://www.xml.com/pub/a/2005/07/06/tr.html and http://www.ibm.com/developerworks/library/x-xdpshpul.html.

Best Practices

While XSLT is extremely fast and powerful, there are several rules to keep in mind in order to write quality code. They are as follows:

  • Avoid using // near the root of the document, especially when transforming very large XML documents. The // abbreviation selects every matching descendant of the context node, however deep it is in the document. It is best to avoid // altogether if possible: it requires more scanning of the XML document, which makes transforms slower and less efficient.
  • Avoid very long XPath expressions (i.e. more than a screen width long). They make the XSLT logic difficult to read.
  • Set the indent attribute of the xsl:output declaration to "no" when outputting XML or HTML. Not only will this reduce the size of the file you generate, but it will also decrease the processing time.
  • Try to use template matching (push method) instead of named templates (pull method). Named templates are fine for utility functions like the padding template listed above. However, template matching will create cleaner and more elegant code.
  • Make use of built-in functions whenever possible. A good example is string concatenation. One approach would be to use several xsl:value-of instructions. However, it is much cleaner to use the XPath concat() function instead.
  • If you are transforming a large amount of data through .NET code, you should use the XmlReader and XmlWriter classes. If you try to use the XmlDocument class to read in your XML and the StringBuilder class to write out your XML, you are likely to get an OutOfMemoryException, since the data must be loaded into one contiguous block of memory.
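
Several of these rules can be seen together in the minimal sketch below. It reuses the Orders, Invoice, and Item element names from the push and pull examples above; the list markup it produces is only an assumption for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- indent takes "yes" or "no"; "no" avoids extra whitespace in the output. -->
  <xsl:output method="html" indent="no"/>

  <xsl:template match="/">
    <ul>
      <!-- Explicit path: only the Orders/Invoice/Item branch is visited.
           The slower alternative, select="//Item", would scan every
           descendant of the document root. -->
      <xsl:apply-templates select="Orders/Invoice/Item"/>
    </ul>
  </xsl:template>

  <xsl:template match="Item">
    <!-- One concat() call instead of several xsl:value-of instructions. -->
    <li><xsl:value-of select="concat(Description, ': ', TotalCost)"/></li>
  </xsl:template>
</xsl:stylesheet>
```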

Additional best practices can be found here:

http://www.xml.org//sites/www.xml.org/files/xslt_efficient_programming_techniques.pdf

XSLT Tips for Cleaner Code and Better Performance

Conclusion

There are many situations in which XSLT is worth considering. The language tends to be verbose, and at times it can feel unnatural to program in if you are more accustomed to a procedural programming style. However, it is a flexible and powerful language that, with a little time, can be easy to pick up and learn. There are debugging and profiling tools available to make the development process easier. In addition, changes to an XSLT do not require compilation in order to test, and testing can easily be done by comparing output with a compare tool such as Araxis Merge.

XSLT Tips for Cleaner Code and Better Performance

On this page:

  • Avoid XSLT Named Templates; Use Template Match
  • Avoid xsl:for-each; Use Template Match
  • You don’t have to use xsl:element or xsl:attribute
  • Use the element name itself rather than xsl:element
  • Use the { } shorthand for writing values inside of attributes
  • Use template modes
  • Use in-built functions: concat()
  • Use in-built functions: boolean()
  • Use in-built functions: string()
  • Use in-built functions: number()
  • Use in-built functions: other
  • More tips

XSLT is a transformation language to convert XML from one format to another (or to another text-based output).

People seem to love or hate XSLT. Some find it hard to read or strange to get used to. Yet, it can be quite elegant when coded right. So this will be the first in a series of posts to show where it can be useful (and what its pitfalls/annoyances may be), how to make best use of XSLT, etc.

This first post looks at coding style in XSLT 1.0 and XPath 1.0.

I think some frustrations at this technology come from wanting to do procedural programming with it, whereas it is really more like a functional programming language; you define what rules to act against, rather than how to determine the rules (kind of).

For example, consider the following example where a named template may be used to create a link to a product:

<xsl:template name="CreateLink">
  <xsl:param name="product" />
  <xsl:element name="a">
    <xsl:attribute name="href">
      <xsl:value-of select="'/product/?id='" /><xsl:value-of select="normalize-space($product/@id)" />
    </xsl:attribute>
    <xsl:value-of select="$product/name" />
  </xsl:element>
</xsl:template>

I have found the above to be a common way people initially code their XSLTs. Yet, the following is far neater:

<xsl:template match="product">
  <a href="{concat('/product/?id=', normalize-space(./@id))}">
    <xsl:value-of select="./@name" />
  </a>
</xsl:template>

Not only does such neater coding become easier to read and maintain, but it can even improve performance.

(Update: As Azat rightly notes in a comment below the use of ‘./’ is redundant. That is definitely true. I should have added originally that I tend to use that to help others in the team, especially those newer to XSLT to understand the context of which element your template is running under a bit more clearly.)

Let's look at a few tips on how this may be possible (a future post will concentrate on additional performance-related tips; the tips below are primarily on coding style):

Avoid XSLT Named Templates; Use Template Match

The first coding practice that leads to code bloat and hard to read XSLT is using named templates everywhere. Named templates give a procedural feel to coding. (You define templates with names, pass parameters as needed and do some stuff). This may feel familiar to most coders, but it really misses the elegance and flexibility of XSLT.

So, instead of this:

<xsl:template name="CreateLink">
  <xsl:param name="product" />
  <!-- create the link here based on the product parameter -->
</xsl:template>

<!-- The above would be called from elsewhere using this: -->
<xsl:call-template name="CreateLink">
  <xsl:with-param name="product" select="./product" />
</xsl:call-template>

Far neater would be this:

<xsl:template match="product">
  <!-- create the link here based on the current product node -->
</xsl:template>

<!-- The above would be called from elsewhere using this: -->
<xsl:apply-templates select="./product" />

The above example doesn’t look like much on its own. When you have a real stylesheet with lots of template matches, (and modes, which we look at later) this gets a lot easier to read, and cuts a LOT of code, especially when calling/applying these templates.

(Of course, each tip has exceptions; named templates can be useful for utility functions. Sometimes XSLT extension objects can be useful for that too, depending on your parser and runtime requirements. A subsequent post on XSLT performance tips will cover that.)

Avoid xsl:for-each; Use Template Match

xsl:for-each is another programming construct that would appeal to many coders. But again, it is rarely needed. Let the XSLT processor do the looping for you (it has potential to be optimised further, too).

With some XSLT processors, xsl:for-each may run a bit quicker, because it saves the processor from having to determine which of possibly many matching templates is the right one to execute. However, matched templates that use modes overcome that issue to a large extent, and lend themselves to highly elegant, reusable XSLT.
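
To make the contrast concrete, here is a small sketch. The products wrapper element and the list-item mode name are assumptions made for this example. First, the loop version:

```xml
<!-- Loop version: the body of the loop is tied to this one place. -->
<xsl:template match="products">
  <ul>
    <xsl:for-each select="product">
      <li><xsl:value-of select="name" /></li>
    </xsl:for-each>
  </ul>
</xsl:template>
```

And the same output using a template match, which would replace the template above rather than sit beside it:

```xml
<!-- Template-match version: the processor does the iteration. -->
<xsl:template match="products">
  <ul>
    <xsl:apply-templates select="product" mode="list-item" />
  </ul>
</xsl:template>

<!-- Reusable from anywhere that applies templates to product in this mode. -->
<xsl:template match="product" mode="list-item">
  <li><xsl:value-of select="name" /></li>
</xsl:template>
```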

You don’t have to use xsl:element or xsl:attribute

You can use xsl:element and xsl:attribute, but it leads to very bloated code.

Here are a few examples of what you can do instead. In each example we will just assume we are working with some XML that represents some kind of product (it is not important what this structure is for this discussion).

Use the element name itself rather than xsl:element

Instead of

<xsl:element name="p">
  <xsl:value-of select="$product/name" />
</xsl:element>

This is a lot cleaner to read:

<p>
  <xsl:value-of select="$product/name" />
</p>

Sometimes I prefer this:

<p><xsl:value-of select="$product/name" /></p>

Use the { } shorthand for writing values inside of attributes

Using xsl:attribute with xsl:value-of for many attributes gets verbose very quickly: there is more code to read, and it looks uglier and more bloated. For attributes only, you can use the { } shorthand (an attribute value template, which is part of XSLT 1.0) as a replacement for xsl:value-of.

In between { and } you just put in your normal select expression.

So, instead of

<h3>
    <xsl:attribute name="class">
        <xsl:value-of select="$product/@type" />
    </xsl:attribute>
    <xsl:value-of select="$product/name" />
</h3>

This is a lot cleaner to read:

<h3 class="{$product/@type}">
  <xsl:value-of select="$product/name" />
</h3>

Or, instead of

<xsl:element name="img">
    <xsl:attribute name="src"><xsl:value-of select="$product/image/@src" /></xsl:attribute>
    <xsl:attribute name="width"><xsl:value-of select="$product/image/@width" /></xsl:attribute>
    <xsl:attribute name="height"><xsl:value-of select="$product/image/@height" /></xsl:attribute>
    <xsl:attribute name="alt"><xsl:value-of select="$product/image" /></xsl:attribute>
    <xsl:attribute name="class"><xsl:value-of select="$product/@type" /></xsl:attribute>
</xsl:element>

This is a lot cleaner to read:

<img
    src="{$product/image/@src}"
    width="{$product/image/@width}"
    height="{$product/image/@height}"
    alt="{$product/image}"
    class="{$product/@type}"
    />

The above is only put onto multiple lines for this web page. In a proper editor sometimes a one-liner is even easier to read:

<img src="{$product/image/@src}" width="{$product/image/@width}" height="{$product/image/@height}" alt="{$product/image}" class="{$product/@type}" />

The above is also looking a lot like some templating languages now, and you might see why I am wondering why there are so many proprietary ones people have to learn, when XSLT is an open, widely supported, standard with transferable skills!

The above also doesn’t show how clean the code would really be, because someone using xsl:attribute is likely to use xsl:element as well, so really we should compare the legibility of this:

<xsl:element name="h3">
    <xsl:attribute name="class">
        <xsl:value-of select="$product/@type" />
    </xsl:attribute>
    <xsl:value-of select="$product/name" />
</xsl:element>

… versus this:

<h3 class="{$product/@type}">
    <xsl:value-of select="$product/name" />
</h3>

Use template modes

Often, you will want to use a template match for totally different purposes. Rather than pass unnecessary parameters or resort to different named templates, a mode attribute on the template can do the trick.

For example, suppose you are showing an order history for some e-commerce site. Suppose you want a summary of orders at the top that anchor to the specific entries further down the page.

You can have more than one template have the same match, and use mode to differentiate or indicate what they are used for.

Consider this example. First, here is a starting point in the XSLT. The idea is to reuse the Orders element, one for summary purpose, the next for details.

<!-- starting point -->
<xsl:template match="/">
    <h1>Order summary</h1>
    <h2>Summary of orders</h2>
    <p><xsl:apply-templates select="./Orders" mode="summary-info" /></p>
    <h2>Table of orders</h2>
    <xsl:apply-templates select="./Orders" mode="order-summary-details" />
</xsl:template>

Next, we match Orders with the summary-info mode:

<xsl:template match="Orders" mode="summary-info">
    <xsl:value-of select="concat(count(./Order), ' orders, from ', ./Order[1]/@date, ' to ', ./Order[last()]/@date)" />
</xsl:template>

We can also match Orders for the order-summary-details mode. Note how the variable has also re-used the other mode to get the summary for the table’s summary attribute.

<xsl:template match="Orders" mode="order-summary-details">
    <xsl:variable name="summary">
        <xsl:apply-templates select="." mode="summary-info" />
    </xsl:variable>
    <table summary="{normalize-space($summary)}">
        <thead>
            <tr>
                <th scope="col">Order number</th>
                <th scope="col">Amount</th>
                <th scope="col">Status</th>
            </tr>
        </thead>
        <tbody>
            <xsl:apply-templates select="./Order" mode="order-summary-details" />
        </tbody>
    </table>
</xsl:template>

Note how the same mode name can be used for additional matches. This is a neat way to keep related functionality together:

<xsl:template match="Order" mode="order-summary-details">
    <tr>
        <td><a href="/order/details/?id={./@id}"><xsl:value-of select="./@id" /></a></td>
        <td><xsl:value-of select="./amount" /></td>
        <td><xsl:value-of select="./status" /></td>
    </tr>
</xsl:template>
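
To make the templates above concrete, here is a small input document they could be applied to. The ids, dates, amounts and statuses are invented for illustration only; the element and attribute names are the ones the templates select.

<!-- Hypothetical sample input: values are made up, names match the templates above -->
<Orders>
    <Order id="1001" date="2009-01-05">
        <amount>25.00</amount>
        <status>Shipped</status>
    </Order>
    <Order id="1002" date="2009-02-11">
        <amount>12.50</amount>
        <status>Pending</status>
    </Order>
</Orders>

With this input, the summary-info template produces the text "2 orders, from 2009-01-05 to 2009-02-11", and the same text ends up in the table's summary attribute.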

In many real XSLTs I have written, these modes are re-used many times over. They keep the code compact and elegant, and they can also help with performance, because the XSLT processor can use the mode to narrow down which templates it has to consider when looking for the one to execute.

The use of modes (and other features such as importing other XSLTs and overriding moded templates) has allowed us to create multiple sub-sites in parallel, for example an e-commerce site that sells books and entertainment products (CDs, DVDs, computer games, etc.), all running off the same XSLTs with some minor customisation in each sub-site. Although the actual data is different, it falls into the same XML structure (they are all products, after all), which makes the XSLTs highly reusable. A future post will describe arranging XSLTs in an almost object-oriented fashion.
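
As a rough sketch of what such a customisation might look like (not the actual production code), a sub-site stylesheet can import the shared XSLT and override a single moded template. The file name base.xsl, the class name and the /books/ URL prefix are assumptions made up for this example.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- base.xsl is a hypothetical shared stylesheet containing the templates shown earlier -->
    <xsl:import href="base.xsl" />

    <!-- Same match and mode as the imported template; this one is chosen
         because the importing stylesheet has higher import precedence -->
    <xsl:template match="Order" mode="order-summary-details">
        <tr class="book-order">
            <td><a href="/books/order/details/?id={./@id}"><xsl:value-of select="./@id" /></a></td>
            <td><xsl:value-of select="./amount" /></td>
            <td><xsl:value-of select="./status" /></td>
        </tr>
    </xsl:template>
</xsl:stylesheet>

Everything not overridden here, such as the Orders templates, still comes from the imported stylesheet.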

Use in-built functions: concat()

The concat() function allows you to remove unnecessary and excessive uses of <xsl:value-of /> statements one after the other (along with the accompanying <xsl:text> </xsl:text> type of trick to get a white space in there).

Code looks easier to read, in most cases, and typically performs better too.

Example:

Instead of this:

<xsl:value-of select="$string1" /><xsl:text> </xsl:text><xsl:value-of select="$string2" />

This is much cleaner to read:

<xsl:value-of select="concat($string1, ' ', $string2)" />

Or,

Instead of this:

<a>
    <xsl:attribute name="href">
        <xsl:value-of select="$domain" />/product/?<xsl:value-of select="$someProductId" />
    </xsl:attribute>
    <xsl:value-of select="$productDescription" />
</a>

This is much cleaner to read:

<a href="{concat($domain, '/product/?', $someProductId)}">
    <xsl:value-of select="$productDescription" />
</a>

Storing a string resulting from a concat into a variable is also efficient from a performance point of view (storing node-sets does not cache the result, as in most DOM and XSLT implementations, node-sets are live collections. More on that in a future post).
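
A minimal sketch of that idea: compute the concatenated URL once and reuse the variable. The variable name productUrl is an assumption for illustration; the other variables are the ones from the example above.

<!-- productUrl is a hypothetical name; the string is built once here -->
<xsl:variable name="productUrl" select="concat($domain, '/product/?', $someProductId)" />

<a href="{$productUrl}">
    <xsl:value-of select="$productDescription" />
</a>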

(Update: Azat notes in a comment below that the above href attribute can be even further simplified into this: href="{$domain}/product/?{$someProductId}".)

Use in-built functions: boolean()

How many times have we seen code like this:

<xsl:if test="$new = 'true'"> ... </xsl:if>

While it works, string comparison is not ideal, especially if this kind of test is going to be repeated in a template.

It would be better to create a variable using this kind of syntax:

<xsl:variable name="isNew" select="boolean($new = 'true')" />

Then, in your code, when you need to use it, you can do things like:

<xsl:if test="$isNew"> ... </xsl:if>

or

<xsl:if test="$isNew = true()"> ... </xsl:if>

or

<xsl:if test="$isNew = false()"> ... </xsl:if>

or

<xsl:if test="not($isNew)"> ... </xsl:if>

The variations above are down to style/preference, but any of them is better from a coding perspective than constantly testing strings. (Sometimes the calculation of what true or false means may require testing many values, such as true, True, 1, Y, etc. This can all be hidden away in that one variable declaration, and the rest of the code is unchanged.)
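
For instance, a declaration that accepts several spellings of "true" might look like the sketch below. The list of accepted values is only an assumption; adjust it to whatever your data actually contains.

<!-- Hypothetical: $new may arrive as true, True, 1 or Y -->
<xsl:variable name="isNew" select="boolean($new = 'true' or $new = 'True' or $new = '1' or $new = 'Y')" />

<xsl:if test="$isNew"> ... </xsl:if>

The tests that use $isNew stay exactly as they were; only the one declaration knows about the different spellings.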

(Update: Azat rightly notes in a comment below that the variable declaration can be made smaller by omitting the actual boolean function, so it is just this: <xsl:variable name="isNew" select="$new = 'true'" />. I find the explicit use of boolean can aid readability, especially for those new to XSLT, so it might be useful to retain in such situations.)

Use in-built functions: string()

Instead of this:

<xsl:variable name="mystring">my text</xsl:variable>

Consider this:

<xsl:variable name="mystring" select="'my text'" />

Or this:

<xsl:variable name="mystring" select="string('my text')" />

Or, more importantly, instead of this:

<xsl:variable name="bookTitle"><xsl:value-of select="./title" /></xsl:variable>

Consider this:

<xsl:variable name="bookTitle" select="string(./title)" />

Why?

Code is cleaner to read.

But it is also more optimal; casting to a string instead of storing the node will result in the variable value being cached in most XSLT processors, rather than being re-evaluated each time it is accessed. (XML nodes are live collections according to W3C which means they may change. Hence references to nodes require evaluation each time they are accessed.)

Use in-built functions: number()

For the same reasons given above for string(), number() should be used when a variable holds a numeric value.
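
A short sketch, following the same pattern as the string() examples. The price element and the threshold of 10 are assumptions for illustration.

Instead of this:

<xsl:variable name="price"><xsl:value-of select="./price" /></xsl:variable>

Consider this:

<!-- price is a hypothetical child element holding a numeric value -->
<xsl:variable name="price" select="number(./price)" />

<xsl:if test="$price &gt; 10"> ... </xsl:if>

Note that number() returns NaN when the text is not numeric, and any comparison against NaN is false.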

Use in-built functions: other

XPath functions such as starts-with() and string-length() are handy.

For example, it is common to see code that tests whether a string is present by comparing a variable with the empty string (''). But as most programmers should know, it is more efficient to test for the presence of a string by testing its length. In XPath expressions you can use the string-length() function for this.
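
A quick sketch of both functions. The variable names $name and $sku and the 'BK-' prefix are assumptions made up for this example.

Instead of this:

<xsl:if test="$name != ''"> ... </xsl:if>

Consider this:

<xsl:if test="string-length($name) &gt; 0"> ... </xsl:if>

And to check how a value begins:

<!-- true when the hypothetical $sku value begins with BK- -->
<xsl:if test="starts-with($sku, 'BK-')"> ... </xsl:if>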

For more information and full list of XPath functions, consider the following:

  • The XPath 1.0 Specification from the W3C
  • The MSDN XPath reference from Microsoft (Same as the W3C information, of course, but has useful examples)
  • Documentation from Stylus (also has some useful examples)

More tips

The above is about XPath 1.0 and XSLT 1.0. Even with the above tips, some XSLT can require more code than ideal, which XSLT 2.0 and XPath 2.0 help to address. The features in those are very useful for sure, but not as widely implemented as 1.0. My experience is almost entirely with 1.0, which we use in live, production/run-time environments.

Here are some additional useful tips:

  • There are Monsters in My Closet or How Not to Use XSLT by R. Alexander Milowski from the School of Information Management and Systems, Berkeley
  • XSLT by Example is a blog of XSLT examples by Miguel deMelo

Do you have any useful tips to augment/improve the above? Let me know and I will add them above.