Sascha Weinreuter - 29 Jul 06 22:08
formatting
While not a showstopper, I think it's a rather odd behavior, at least from the API-user's point of view. Is this by design or is there a chance that this will change for the final 6.0?
It would not be the end of the world if it stays as it is, but then the behavior should be documented.
This behaviour is by design.
There is a contract stating that text obtained from PSI must be the same as the file text, i.e. the following should be true: document.getText().equals(psiFile.getText())
And, moreover, the same should hold for any PSI element.
I agree that it applies to PsiElements, but not necessarily to ASTNodes (even if, from the internal implementation's point of view, they are the same). The end result is that a language needs to know which context it is injected into (if injected at all):
Suppose I have a token INTEGER_LITERAL. If injected into an XML attribute, the text can be "1234" or "&#x31;&#x32;&#x33;&#x34;". In both cases, the text passed to my lexer is the same, which is good. But how am I supposed to deal with that when I want to calculate the literal's value? There doesn't even seem to be a utility function that could help me decode it myself.
Suggestion: Add a method getDecodedText() (or similar) to ASTNode and/or ASTWrapperPsiElement that at least provides a convenient solution for the case when I need to process the text myself.
Stupid JIRA formatting: Of course I meant "& #x31;& #x32;& #x33;& #x34;" (without the spaces)
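To make the decoding problem concrete, here is a minimal standalone sketch of what a language would have to do by hand for the XML attribute case. XmlTextDecoder, decode and integerLiteralValue are hypothetical names invented for this illustration; none of this is an IntelliJ API, and it is not the proposed getDecodedText(). It assumes only XML numeric character references and the five predefined entities need handling.

```java
import java.util.Map;

// Hypothetical helper, not part of the IntelliJ API: decodes XML character
// references so that an injected language can evaluate raw token text.
final class XmlTextDecoder {
    // The five entities predefined by XML itself.
    private static final Map<String, String> PREDEFINED = Map.of(
            "amp", "&", "lt", "<", "gt", ">", "quot", "\"", "apos", "'");

    // Replaces every recognised reference; anything unrecognised is copied verbatim.
    static String decode(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            int end = (c == '&') ? raw.indexOf(';', i + 1) : -1;
            String replacement = (end > i + 1) ? resolve(raw.substring(i + 1, end)) : null;
            if (replacement == null) {
                out.append(c);
                i++;
            } else {
                out.append(replacement);
                i = end + 1;
            }
        }
        return out.toString();
    }

    // Resolves the part between '&' and ';', or returns null if it is unknown.
    private static String resolve(String name) {
        if (name.startsWith("#")) {
            try {
                boolean hex = name.startsWith("#x");
                int codePoint = Integer.parseInt(name.substring(hex ? 2 : 1), hex ? 16 : 10);
                return Character.isValidCodePoint(codePoint)
                        ? new String(Character.toChars(codePoint))
                        : null;
            } catch (NumberFormatException e) {
                return null;
            }
        }
        return PREDEFINED.get(name);
    }

    // Value of an INTEGER_LITERAL token whose raw text may still be XML-escaped.
    static int integerLiteralValue(String rawTokenText) {
        return Integer.parseInt(decode(rawTokenText));
    }
}
```

With this, both spellings of the attribute value from the example evaluate to the same integer. The catch described in this thread remains: the language still has to know that its host is XML in order to pick this particular decoder.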
Ok, here's another problem:
It makes a difference whether an element is part of the prefix/suffix of an injected fragment or not. While elements that are part of e.g. a String literal return the escaped text, elements from the prefix/suffix return the unescaped text. Even though this appears logical at first glance, it is kind of a showstopper because it makes it impossible to decide whether to manually decode the text or not (e.g. through getContainingFile().getContext() instanceof PsiLiteralExpression).
I see the following possibilities to address this (in order of preference):
Please respond ASAP. Thanks.
All I can do in the meantime is refer you to the highly obscure, implementation-tied method
com.intellij.psi.impl.source.tree.injected.InjectedLanguageUtil#isInInjectedLanguagePrefixSuffix, which of course will be changed in the future, and so on. Overall, things like prefix/suffix handling should be reviewed, since currently a single quote character (' or ") typed into injected JavaScript breaks all prefix/suffix handling: it turns all text after the quote into one long string literal spanning the rest of the injected text, including the suffix.
Well, that's good enough for me for the moment. Thanks a lot for the hint, this at least helps me to deal with the issue in a new language that is explicitly meant to be injected into strings.
Hmm, that still requires the language to be aware that it is potentially injected. I was looking for a more transparent solution, but I guess this would be too hard because it violates certain assumptions about PSI & text. But the new method is a good start anyway. Thanks.
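For readers following the prefix/suffix discussion, here is a toy model of the decision an injected language currently has to make. InjectedFragmentModel and its methods are hypothetical names for this sketch only; it does not call or reproduce isInInjectedLanguagePrefixSuffix. It assumes the injected file text is simply prefix + raw (still escaped) host text + suffix, and that an element lies entirely in one region.

```java
// Hypothetical toy model, not IntelliJ code: an injected file seen as
// prefix + raw (still escaped) host text + suffix.
final class InjectedFragmentModel {
    private final String prefix;
    private final String rawHostText;
    private final String suffix;

    InjectedFragmentModel(String prefix, String rawHostText, String suffix) {
        this.prefix = prefix;
        this.rawHostText = rawHostText;
        this.suffix = suffix;
    }

    // Text of the whole injected file under the assumption stated above.
    String fileText() {
        return prefix + rawHostText + suffix;
    }

    // True if the range [start, end) lies completely inside the prefix or the suffix.
    boolean isInPrefixOrSuffix(int start, int end) {
        int hostStart = prefix.length();
        int hostEnd = hostStart + rawHostText.length();
        return end <= hostStart || start >= hostEnd;
    }

    // Prefix/suffix text is already unescaped; only host text needs manual decoding.
    boolean needsDecoding(int start, int end) {
        return !isInPrefixOrSuffix(start, end);
    }
}
```

For a fragment built from the prefix "var x = ", the host text "&#x31;&#x32;" and the suffix ";", the range 8 to 20 covers the host text and needs decoding, while 0 to 3 and 20 to 21 do not.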